(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property 
Organization 
International Bureau 

(43) International Publication Date 
22 July 2004 (22.07.2004) 




PCT 



(10) International Publication Number 

WO 2004/061116 A2 



(51) International Patent Classification 7 : 



C12Q 



(21) International Application Number: 

PCT/US2003/034082 

(22) International Filing Date: 24 October 2003 (24.10.2003) 



(25) Filing Language: English 

(26) Publication Language: English 



(74) Agents: GOLIAN, Paul, D. et al.; Bristol-Myers Squibb 
Company, P.O. Box 4000, Princeton, NJ 08543-4000 (US). 

(81) Designated States (national): AE, AG, AL, AM, AT, AU, 
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU, 
CZ, DE, DK, DM, DZ, EC, EE, EG, ES, FI, GB, GD, GE, 
GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, 
KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, 
MN, MW, MX, MZ, NI, NO, NZ, OM, PG, PH, PL, PT, 
RO, RU, SC, SD, SE, SG, SK, SL, SY, TJ, TM, TN, TR, 
TT, TZ, UA, UG, UZ, VC, VN, YU, ZA, ZM, ZW. 



(30) Priority Data: 

10/321,188 



17 December 2002 (17.12.2002) US 



(71) Applicant: BRISTOL-MYERS SQUIBB COMPANY 

[US/US]; P. O. Box 4000, Route 206 and Provinceline 
Road, Princeton, NJ 08543-4000 (US). 

(72) Inventors: BASCH, Jonathan, David; 216 Wellington 
Road, DeWitt, NY 08543 (US). CHIANG, Shu-Jen; 4884 
Edgeworth Drive, Manlius, NY 14104 (US). LIU, Suo-Win; 
4997 Firethorn Circle, Manlius, NY 13104 (US). 
NAYEEM, Akbar; 42 Quince Circle, Newtown, PA 18940 
(US). SUN, Yuhua; 219 Deerfield Road, Apt. 2, East 
Syracuse, NY 13057 (US). YOU, Li; 6316 Westerly Terrace, 
Jamesville, NY 13078 (US). 



(84) Designated States (regional): ARIPO patent (GH, GM, 
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW), 
Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), 
European patent (AT, BE, BG, CH, CY, CZ, DE, DK, EE, 
ES, FI, FR, GB, GR, HU, IE, IT, LU, MC, NL, PT, RO, 
SE, SI, SK, TR), OAPI patent (BF, BJ, CF, CG, CI, CM, 
GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG). 

Published: 

— without international search report and to be republished 
upon receipt of that report 

For two-letter codes and other abbreviations, refer to the "Guidance 
Notes on Codes and Abbreviations" appearing at the beginning 
of each regular issue of the PCT Gazette. 



(54) Title: COMPOSITIONS AND METHODS FOR HYDROXYLATING EPOTHILONES 

[Figure: SC15847 biotransformation of epothilone B to epothilone F] 

(57) Abstract: Isolated nucleic acid sequences and polypeptides encoded thereby for epothilone B hydroxylase and mutants and 
variants thereof and a ferredoxin located downstream from the epothilone B hydroxylase gene are provided. Also provided are 
vectors and cells containing these vectors. In addition, methods for producing recombinant microorganisms, methods for using 
these recombinant microorganisms to produce hydroxyalkyl-bearing epothilones and an epothilone analog produced by a mutant of 
epothilone B hydroxylase are provided. 






COMPOSITIONS AND METHODS FOR HYDROXYLATING EPOTHILONES 



Field of the Invention 

The present invention relates to isolated nucleic acid sequences and 
polypeptides encoded thereby for epothilone B hydroxylase and mutants and variants 
thereof, and a ferredoxin located downstream from the epothilone B hydroxylase 
gene. The present invention also relates to recombinant microorganisms expressing 
epothilone B hydroxylase or a mutant or variant thereof and/or ferredoxin which are 
capable of hydroxylating small organic molecule compounds, such as epothilones, 
having a terminal alkyl group to produce compounds having a terminal hydroxyalkyl 
group. Also provided are methods for recombinantly producing such microorganisms 
as well as methods for using these recombinant microorganisms in the synthesis of 
compounds having a terminal hydroxyalkyl group. The compositions and methods 
of the present invention are useful in the preparation of epothilones having a variety of 
utilities in the pharmaceutical field. A novel epothilone analog produced using a 
mutant of epothilone B hydroxylase of the present invention is also described. 



Background of the Invention 

Epothilones are macrolide compounds that find utility in the pharmaceutical 
field. For example, epothilones A and B having the structures: 

[Structure: epothilone A (R=H) and epothilone B (R=Me)] 

have been found to exert microtubule-stabilizing effects similar to paclitaxel 
(TAXOL®) and hence cytotoxic activity against rapidly proliferating cells, such as 
tumor cells or cells associated with other hyperproliferative cellular diseases; see 
Bollag et al., Cancer Res., Vol. 55, No. 11, 2325-2333 (1995). 

Epothilones A and B are natural anticancer agents produced by Sorangium 
cellulosum that were first isolated and characterized by Hofle et al., DE 4138042; WO 
93/10121; Angew. Chem. Int. Ed. Engl., Vol. 35, No. 13/14, 1567-1569 (1996); and J. 
Antibiot., Vol. 49, No. 6, 560-563 (1996). Subsequently, the total syntheses of 
epothilones A and B have been published by Balog et al., Angew. Chem. Int. Ed. 
Engl., Vol. 35, No. 23/24, 2801-2803 (1996); Meng et al., J. Am. Chem. Soc., Vol. 
119, No. 42, 10073-10092 (1997); Nicolaou et al., J. Am. Chem. Soc., Vol. 119, No. 
34, 7974-7991 (1997); Schinzer et al., Angew. Chem. Int. Ed. Engl., Vol. 36, No. 5, 
523-524 (1997); and Yang et al., Angew. Chem. Int. Ed. Engl., Vol. 36, No. 1/2, 
166-168 (1997). WO 98/25929 disclosed methods for the chemical synthesis of 
epothilone A, epothilone B, analogs of epothilone and libraries of epothilone analogs. 
The structure and production from Sorangium cellulosum DSM 6773 of epothilones 
C, D, E, and F were disclosed in WO 98/22461. Figure 1 provides a diagram of the 
biotransformation, as described in WO 00/39276, of epothilone B to epothilone F in 
Actinomycetes species strain SC15847 (ATCC PTA-1043), subsequently identified as 
Amycolatopsis orientalis. 

Cytochrome P450 enzymes are found in prokaryotic and eukaryotic cells and 
have in common a heme binding domain which can be distinguished by an 
absorbance peak at 450 nm when complexed with carbon monoxide. Cytochrome 
P450 enzymes perform a broad spectrum of oxidative reactions on primarily 
hydrophobic substrates including aromatic and benzylic rings, and alkanes. In 
prokaryotes they are found as detoxifying systems and as a first enzymatic step in 
metabolizing substrates such as toluene, benzene and camphor. Cytochrome P450 
genes have also been found in biosynthetic pathways of secondary metabolites such as 
nikkomycin in Streptomyces tendae (Bruntner, C. et al., 1999, Mol. Gen. Genet. 262: 
102-114), doxorubicin (Dickens, M.L., Strohl, W.R., 1996, J. Bacteriol. 178: 3389- 
3395) and in the epothilone biosynthetic cluster of Sorangium cellulosum (Julien, B. 
et al., 2000, Gene, 249: 153-160). With a few exceptions, the cytochrome P450 
systems in prokaryotes are composed of three proteins: a ferredoxin NADH or 
NADPH dependent reductase, an iron-sulfur ferredoxin and the cytochrome P450 
enzyme (Lewis, D.F., Hlavica, P., 2000, Biochim. Biophys. Acta, 1460: 353-374). 
Electrons are transferred from ferredoxin reductase to the ferredoxin and finally to the 
cytochrome P450 enzyme for the splitting of molecular oxygen. 

Summary of the Invention 

An object of the present invention is to provide isolated nucleic acid sequences 
encoding epothilone B hydroxylase and variants or mutants thereof and isolated 
nucleic acid sequences encoding ferredoxin or variants or mutants thereof. 

Another object of the present invention is to provide isolated polypeptides 
comprising amino acid sequences of epothilone B hydroxylase and variants or 
mutants thereof and isolated polypeptides comprising amino acid sequences of 
ferredoxin and variants or mutants thereof. 

Another object of the present invention is to provide structure coordinates of 
the homology model of the epothilone B hydroxylase. The structure coordinates are 
listed in Appendix 1. This model of the present invention provides a means for 
designing modulators of a biological function of epothilone B hydroxylase as well as 
additional mutants of epothilone B hydroxylase with altered specificities. 

Another object of the present invention is to provide vectors comprising 
nucleic acid sequences encoding epothilone B hydroxylase or a variant or mutant 
thereof and/or ferredoxin or a variant or mutant thereof. In a preferred embodiment, 
these vectors further comprise a nucleic acid sequence encoding ferredoxin. 

Another object of the present invention is to provide host cells comprising a 
vector containing a nucleic acid sequence encoding epothilone B hydroxylase or a 
variant or mutant thereof and/or ferredoxin or a variant or mutant thereof. 

Another object of the present invention is to provide a method for producing 
recombinant microorganisms that are capable of hydroxylating compounds, and in 
particular epothilones, having a terminal alkyl group to produce compounds having a 
terminal hydroxyalkyl group. 

Another object of the present invention is to provide microorganisms produced 
recombinantly which are capable of hydroxylating compounds, and in particular 
epothilones, having a terminal alkyl group to produce compounds having a terminal 
hydroxyalkyl group. 

Another object of the present invention is to provide methods for 
hydroxylating compounds in these recombinant microorganisms. In particular, the 
present invention provides a method for the preparation of hydroxyalkyl-bearing 
epothilones, which compounds find utility as antitumor agents and as starting 
materials in the preparation of other epothilone analogs. 

Yet another object of the present invention is to provide a compound of 
Formula A: 

[Structure: Formula A] 

referred to herein as 24-OH epothilone B or 24-OH EpoB, as well as compositions 
and methods for production of compositions comprising the compound of Formula A. 

Brief Description of the Figures 

Figure 1 provides a schematic of the biotransformation, as set forth in WO 
00/39276, U.S. Application Serial No. 09/468,854, filed December 21, 1999, of 
epothilone B to epothilone F by Amycolatopsis orientalis strain SC15847 (PTA-1043). 

Figure 2 shows the nucleic acid sequence alignments of SEQ ID NO:5 through 
SEQ ID NO:22 used to design the PCR primers for cloning of the nucleic acid 
sequence encoding epothilone B hydroxylase. 

Figure 3 shows the sequence alignment between epothilone B hydroxylase 
(SEQ ID NO:2) and EryF (PDB code 1JIN chain A; SEQ ID NO:76). The asterisks 
indicate sequence identities, the colons (:) similar residues. 

Figure 4 provides a homology model of epothilone B hydroxylase based upon 
sequence alignment with EryF as shown in Figure 3. 

Figure 5 shows an energy plot of the epothilone B hydroxylase model 
(indicated by dashed line) relative to EryF (PDB code 1JIN; indicated by solid line). 
An averaging window size of 51 residues was used, i.e., the energy at a given residue 
position is calculated as the average of the energies of the 51 residues in the sequence 
that lie with the given residue at the central position. 
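
The window averaging described for Figure 5 may be illustrated by the following minimal sketch. The function name and the treatment of residues near the chain termini (where fewer than 51 residues are available, the window is simply truncated) are assumptions made for illustration and are not taken from the present specification; the per-residue energies themselves would come from whatever energy function is used to evaluate the model.

```python
def windowed_average(energies, window=51):
    """Return, for each residue, the mean energy over a window centered on it.

    Assumption (not stated in the source): at the chain ends the window is
    truncated to the residues that actually exist.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    averaged = []
    for i in range(len(energies)):
        lo = max(0, i - half)                   # first residue in the window
        hi = min(len(energies), i + half + 1)   # one past the last residue
        averaged.append(sum(energies[lo:hi]) / (hi - lo))
    return averaged


# Small illustrative input with a window of 3 instead of 51.
print(windowed_average([1, 2, 3, 4, 5], window=3))
```

A large window such as 51 smooths out local fluctuations so that only extended regions of unfavorable energy stand out when the model and template curves are compared.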



Detailed Description of the Invention 



The present invention relates to isolated nucleic acid sequences and 
polypeptides and methods for obtaining compounds with desired substituents at a 
terminal carbon position. In particular, the present invention provides compositions 
and methods for the preparation of hydroxyalkyl-bearing epothilones, which 
compounds find utility as antitumor agents and as starting materials in the preparation 
of other epothilone analogs. 

The term "epothilone," as used herein, denotes compounds containing an 
epothilone core and a side chain group as defined herein. The term "epothilone core," 
as used herein, denotes a moiety containing the core structure (with the numbering of 
ring system positions used herein shown): 

[Structure: epothilone core with ring position numbering] 

wherein the substituents are as follows: 

Q is selected from the group consisting of 

[Structures: alternatives for Q] 

W is O or NR6; 

X is selected from the group consisting of O, H and OR7; 

M is O, S, NR8, CR9R10; 

B1 and B2 are selected from the group consisting of OR11, OCOR12; 

R1-R5 and R12-R17 are selected from the group consisting of H, alkyl, 
substituted alkyl, aryl, and heterocyclo, and when R1 and R2 are alkyl they can be 
joined to form a cycloalkyl; 

R6 is selected from the group consisting of H, alkyl, and substituted alkyl; 

R7 and R11 are selected from the group consisting of H, alkyl, substituted 
alkyl, trialkylsilyl, alkyldiarylsilyl and dialkylarylsilyl; 

R8 is selected from the group consisting of H, alkyl, substituted alkyl, R13C=O, 
R14OC=O and R15SO2; and 

R9 and R10 are selected from the group consisting of H, halogen, alkyl, 
substituted alkyl, aryl, heterocyclo, hydroxy, R16C=O, and R17OC=O. 

The term "side chain group" refers to substituent G as defined above for 
Epothilone A or B or G1 and G2 as shown below. 

G1 is the following formula V 

HO-CH2-(A1)n-(Q)m-(A2)o (V), 

and 

G2 is the following formula VI 

CH3-(A1)n-(Q)m-(A2)o (VI), 

where 

A1 and A2 are independently selected from the group of optionally substituted 
C1-C3 alkyl and alkenyl; 

Q is an optionally substituted ring system containing one to three rings and at 
least one carbon to carbon double bond in at least one ring; and 

n, m, and o are integers independently selected from the group consisting of 
zero and 1, where at least one of m, n or o is 1. 

The term "terminal carbon" or "terminal alkyl group" refers to the terminal 
carbon or terminal methyl group of the moiety either directly bonded to the epothilone 
core at position 15 or to the terminal carbon or terminal alkyl group of the side chain 
group bonded at position 15. It is understood that the term "alkyl group" includes 
alkyl and substituted alkyl as defined herein. 

The term "alkyl" refers to optionally substituted, straight or branched chain 
saturated hydrocarbon groups of 1 to 20 carbon atoms, preferably 1 to 7 carbon atoms. 
The expression "lower alkyl" refers to optionally substituted alkyl groups of 1 to 4 
carbon atoms. 

The term "substituted alkyl" refers to an alkyl group substituted by, for 
example, one to four substituents, such as halo, trifluoromethyl, trifluoromethoxy, 
hydroxy, alkoxy, cycloalkyloxy, heterocyclooxy, oxo, alkanoyl, aryloxy, alkanoyloxy, 
amino, alkylamino, arylamino, aralkylamino, cycloalkylamino, heterocycloamino, 
disubstituted amines in which the 2 amino substituents are selected from alkyl, aryl or 
aralkyl, alkanoylamino, aroylamino, aralkanoylamino, substituted alkanoylamino, 
substituted arylamino, substituted aralkanoylamino, thiol, alkylthio, arylthio, 
aralkylthio, cycloalkylthio, heterocyclothio, alkylthiono, arylthiono, aralkylthiono, 
alkylsulfonyl, arylsulfonyl, aralkylsulfonyl, sulfonamido (e.g. SO2NH2), substituted 
sulfonamido, nitro, cyano, carboxy, carbamyl (e.g. CONH2), substituted carbamyl 
(e.g. CONH alkyl, CONH aryl, CONH aralkyl or cases where there are two 
substituents on the nitrogen selected from alkyl, aryl or aralkyl), alkoxycarbonyl, aryl, 
substituted aryl, guanidino and heterocyclos, such as indolyl, imidazolyl, furyl, 
thienyl, thiazolyl, pyrrolidyl, pyridyl, pyrimidyl and the like. Where it is noted above 
that the substituent is further substituted, it will be with halogen, alkyl, alkoxy, aryl 
or aralkyl. 

In accordance with one aspect of the present invention there are provided 
isolated polynucleotides that encode epothilone B hydroxylase, an enzyme capable of 
hydroxylating epothilones having a terminal alkyl group to produce epothilones 
having a terminal hydroxyalkyl group. 

In accordance with another aspect of the present invention there are provided 
isolated polynucleotides that encode a ferredoxin, the gene for which is located 
downstream from the epothilone B hydroxylase gene. Ferredoxin is a protein of the 
cytochrome P450 system. 

By "polynucleotides", as used herein, it is meant to include any form of DNA 
or RNA such as cDNA or genomic DNA or mRNA, respectively, encoding these 
enzymes or an active fragment thereof which are obtained by cloning or produced 
synthetically by well known chemical techniques. DNA may be double- or 
single-stranded. Single-stranded DNA may comprise the coding or sense strand or the 
non-coding or antisense strand. Thus, the term polynucleotide also includes 
polynucleotides exhibiting at least 60%, preferably at least 80%, homology to 
sequences disclosed herein, and which hybridize under stringent conditions to the 
above-described polynucleotides. As used herein, the term "stringent conditions" 
means hybridization conditions of 60°C at 2X SSC buffer. More preferred are isolated 
nucleic acid molecules capable of hybridizing to the nucleic acid sequence set forth in 
SEQ ID NO:1, 30, 32, 34, 36, 37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 70, 72, or 74 
or SEQ ID NO:3, or to the complementary sequence of the nucleic acid sequence set 
forth in SEQ ID NO:1, 30, 32, 34, 36, 37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 70, 72, 
or 74 or SEQ ID NO:3, under hybridization conditions of 3X SSC at 65°C for 16 hours, 
and which are capable of remaining hybridized to the nucleic acid sequence set forth 
in SEQ ID NO:1, 30, 32, 34, 36, 37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 70, 72 or 74 
or SEQ ID NO:3, or to the complementary sequence of the nucleic acid sequence set 
forth in SEQ ID NO:1, 30, 32, 34, 36, 37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 70, 
72 or 74 or SEQ ID NO:3, under wash conditions of 0.5X SSC, 55°C for 30 minutes. 
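
The 60% and 80% homology thresholds recited above may be illustrated as follows. The specification does not state how homology is to be computed; the sketch below assumes, purely for illustration, that it is measured as percent identity over two sequences that have already been aligned to equal length, with alignment columns containing a gap ("-") excluded from the count. The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns in which either sequence has a gap ('-') are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical 10-nucleotide example: 8 of 10 positions agree.
identity = percent_identity("ATGCATGCAT", "ATGCATGGTT")
print(identity, identity >= 60.0, identity >= 80.0)
```

Sequence identity is only a computational proxy; whether a given polynucleotide actually hybridizes under the stated conditions must be determined experimentally.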

In one embodiment, a polynucleotide of the present invention comprises the 
genomic DNA depicted in SEQ ID NO:1 or a homologous sequence or fragment 
thereof which encodes a polypeptide having similar activity to that of this epothilone 
B hydroxylase. Alternatively, a polynucleotide of the present invention may comprise 
the genomic DNA depicted in SEQ ID NO:3 or a homologous sequence or fragment 
thereof which encodes a polypeptide having similar activity to this ferredoxin. Due to 
the degeneracy of the genetic code, polynucleotides of the present invention may also 
comprise other nucleic acid sequences encoding this enzyme and derivatives, variants 
or active fragments thereof. 
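
The consequence of the degeneracy of the genetic code noted above may be illustrated by the following sketch, in which two different nucleic acid sequences encode the same peptide. The standard genetic code is assumed, and the two short sequences are hypothetical examples that are not taken from any sequence of the present invention.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence (sense strand) into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two hypothetical sequences that differ at three positions (synonymous codons).
seq_1 = "ATGGCTAGCGAA"
seq_2 = "ATGGCGTCCGAG"
print(translate(seq_1), translate(seq_2))
```

Both sequences translate to the same four-residue peptide, which is why a single polypeptide sequence corresponds to a family of encoding polynucleotides.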

The present invention also relates to variants of these polynucleotides which 
may be naturally occurring, i.e., present in microorganisms such as Amycolatopsis 
orientalis and Amycolata autotrophica, or in soil or other sources from which nucleic 
acids can be isolated, or mutants prepared by well known mutagenesis techniques. 
Exemplary variant polynucleotides of the present invention are depicted in SEQ ID 
NO:36-42. 

By "mutants" as used herein it is meant to be inclusive of nucleic acid 
sequences with one or more point mutations, or deletions or additions of nucleic acids 
as compared to SEQ ID NO:1 or 3, but which still encode a polypeptide or fragment 
with similar activity to the polypeptides encoded by SEQ ID NO:1 or 3. In a 
preferred embodiment, mutations are made which alter the substrate specificity and/or 
yield of the enzyme. A preferred region of mutation with respect to the epothilone B 
hydroxylase gene is that region of the nucleic acid sequence coding for the 
approximately 113 amino acid residues comprising the active site of the enzyme. 
Also preferred are mutants encoding a polypeptide with at least one amino acid 
substitution at amino acid position GLU31, ARG67, ARG88, ILE92, ALA93, 
VAL106, ILE130, ALA140, MET176, PHE190, GLU231, SER294, PHE237, or 
ILE365 of SEQ ID NO:1. Exemplary polynucleotide mutants of the present invention 
are depicted in SEQ ID NO:30, 32, 34, 60, 62, 64, 66, 68, 70, 72 and 74. 

Cloning of the nucleic acid sequence of SEQ ID NO:1 encoding epothilone B 
hydroxylase was performed using PCR primers designed by aligning the nucleic acid 
sequences of six cytochrome P450 genes from bacteria. The following cytochrome 
P450 genes were aligned: 

Sequence 1: Locus: STMSUACB; Accession number: M32238; Reference: 
Omer, C.A., J. Bacteriol. 172: 3335-3345 (1990) 

Sequence 2: Locus: STMSUBCB; Accession number: M32239; Reference: 
Omer, C.A., J. Bacteriol. 172: 3335-3345 (1990) 

Sequence 3: Locus: AB018074 (formerly STMORFA); Accession number: 
AB018074; Reference: Ueda, K., J. Antibiot. 48: 638-646 (1995) 

Sequence 4: Locus: SSU65940; Accession number: U65940; Reference: 
Motamedi, H., J. Bacteriol. 178: 5243-5248 (1996) 

Sequence 5: Locus: STMOLEP; Accession number: L37200; Reference: 
Rodriguez, A.M., FEMS Microbiol. Lett. 127: 117-120 (1995) 

Sequence 6: Locus: SERCP450A; Accession number: M83110; Reference: 
Andersen, J.F. and Hutchinson, C.R., J. Bacteriol. 174: 725-735 
(1992) 

Alignments were performed using an implementation of the algorithm of 
Myers, E.W. and W. Miller, 1988, CABIOS 4:1, 11-17, the Align program from 
Scientific and Educational Software (Durham, North Carolina, USA). Three highly 
conserved regions were identified: in the I-helix, containing the oxygen binding 
domain; in the K-helix; and spanning the B-bulge and L-helix containing the 
conserved heme binding domain. Primers were designed to the three conserved 
regions identified in the alignment. Primers P450-1+ (SEQ ID NO:23) and P450-1a+ 
(SEQ ID NO:24) were designed from the I-helix, primer P450-2+ (SEQ ID NO:25) 
was designed from the B-bulge and L-helix region, and primer P450-3- (SEQ ID 
NO:27) was designed as the reverse complement to the heme binding domain. 
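
Designing a reverse primer as the reverse complement of a conserved region, as described above for primer P450-3-, may be sketched as follows. The function name and the example sequence are hypothetical and do not correspond to SEQ ID NO:27; only the four unambiguous bases are handled, whereas primers designed from multi-sequence alignments frequently contain degenerate positions that this sketch leaves unchanged.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


# Hypothetical sense-strand stretch of a conserved region.
conserved_region = "ATGCCG"
reverse_primer = reverse_complement(conserved_region)
print(reverse_primer)
```

The resulting sequence anneals to the sense strand, so that a forward primer from an upstream conserved region and this reverse primer together bracket the fragment to be amplified.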

Genomic fragments were then amplified via polymerase chain reaction (PCR). 
After PCR amplification, the reaction products were separated by gel electrophoresis 
and fragments of the expected size were excised. The DNA was extracted from the 
agarose gel slices using the Qiaquick gel extraction procedure (Qiagen, Santa Clarita, 
California, USA). The fragments were then cloned into the PCRscript vector 
(Stratagene, La Jolla, California, USA) using the PCRscript Amp cloning kit 
(Stratagene). Colonies containing inserts were picked into 1-2 ml of LB broth with 100 
µg/ml ampicillin and grown at 30-37°C for 16-24 hours at 230-300 rpm. Plasmid 
isolation was performed using the Mo Bio miniplasmid prep kit (Mo Bio, Solana 
Beach, California, USA). This plasmid DNA was used as a PCR and sequencing 
template and for restriction digest analysis. 

The cloned PCR products were sequenced using the Big-Dye sequencing kit 
from Applied Biosystems (Foster City, California, USA) and were analyzed using the 
ABI310 sequencer (Applied Biosystems, Foster City, California, USA). The sequence 
of the inserts was used to perform a TblastX search, using the protocol of Altschul, 
S.F., et al., J. Mol. Biol. 215:403-410 (1990), of the non-redundant protein database. 
Unique sequences having a significant similarity to known cytochrome P450 proteins 
were retained. Using this approach, a total of nine different P450 sequences were 
identified from SC15847, seven from the genomic DNA template and two from the 
cDNA. Two P450 sequences were found in common between the DNA and cDNA 
templates. Of the fifty cDNA clones analyzed, two sequences were predominant, 
with twenty clones each. These two genes were then cloned from the genomic DNA. 

The nucleic acid sequence of the genomic DNA was determined using the 
Big-Dye sequencing system (Applied Biosystems) and analyzed using an ABI310 
sequencer. This sequence is depicted in SEQ ID NO:1. An open reading frame 
coding for a protein of 404 amino acids and a predicted molecular weight of 44.7 kDa 
was found within the cloned BglII fragment. The deduced amino acid sequence of 
this polypeptide is depicted in SEQ ID NO:2. The amino acid sequence of this 
polypeptide was found to share 51% identity with the NikF protein of Streptomyces 
tendae (Bruntner, C. et al., 1999, Mol. Gen. Genet. 262: 102-114) and 48% identity 
with the Sca-2 protein of S. carbophilus (Watanabe, I. et al., 1995, Gene 163: 81-85). 
Both of these enzymes belong to the cytochrome P450 family 105. The invariable 
cysteine found in the heme-binding domain of all cytochrome P450 enzymes is found 
at residue 356. This gene for epothilone B hydroxylase has been named ebh. The 
ATG start codon of a putative ferredoxin gene of 64 amino acids is found nine 
basepairs downstream from the stop codon of ebh. This ferredoxin was found to share 
50% identity with ferredoxins of S. griseolus (O'Keefe, D.P., et al., 1991, 
Biochemistry 30: 447-455) and S. noursei (Brautaset, T., et al., 2000, Chem. Biol. 7: 
395-403). The nucleic acid sequence encoding this ferredoxin is depicted in SEQ ID 
NO:3 and the amino acid sequence for this ferredoxin polypeptide is depicted in SEQ 
ID NO:4. 
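
As a rough plausibility check on the figures given above, the length of the open reading frame and the approximate mass of the encoded protein may be estimated from the residue count alone. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not the method by which the 44.7 kDa value of the present specification was obtained; an exact prediction requires summing the masses of the actual residues of SEQ ID NO:2. The function names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, an assumption


def orf_length_nt(n_residues):
    """Nucleotides in an open reading frame, including the stop codon."""
    return 3 * n_residues + 3


def approx_mass_kda(n_residues, average_residue_mass=AVERAGE_RESIDUE_MASS_DA):
    """Approximate protein mass in kDa from the number of residues."""
    return n_residues * average_residue_mass / 1000.0


# Epothilone B hydroxylase (404 residues) and the putative ferredoxin (64 residues).
print(orf_length_nt(404), round(approx_mass_kda(404), 2))
print(orf_length_nt(64), round(approx_mass_kda(64), 2))
```

The estimate of roughly 44.4 kDa for 404 residues is in reasonable agreement with the predicted molecular weight of 44.7 kDa.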

The ebh gene sequence was also used to isolate variant cytochrome P450 
genes from other microorganisms. Exemplary variant polynucleotides ebh43491, 
ebh14930, ebh53630, ebh53550, ebh39444, ebh43333 and ebh35165 of the present 
invention and the species from which they were isolated are depicted in Table 1 
below. The nucleic acid sequences for these variants are depicted in SEQ ID NO:36-42, 
respectively. 

Table 1: Variant polynucleotides 

ATCC ID    Species                     ebh gene designation 
43491      Amycolatopsis orientalis    ebh43491 
14930      Amycolatopsis orientalis    ebh14930 
53630      Amycolatopsis orientalis    ebh53630 
53550      Amycolatopsis orientalis    ebh53550 
39444      Amycolatopsis orientalis    ebh39444 
43333      Amycolatopsis orientalis    ebh43333 
35165      Amycolatopsis orientalis    ebh35165 



The amino acid sequences encoded by the exemplary variants ebh43491, 
ebh14930, ebh53630, ebh53550, ebh39444, ebh43333 and ebh35165 are depicted in 
SEQ ID NO:43-49, respectively. Table 2 provides a summary of the amino acid 
substitutions of these exemplary variants. 

Table 2: Amino acid substitutions 

Position   ebh   Substitution   ebh variant 
100        Gly   Ser            ebh14930, ebh43333, ebh53550, ebh43491 
101        Lys   Arg            ebh14930 
130        Ile   Leu            ebh14930 
192        Ser   Gln            ebh14930 
224        Ser   Thr            ebh14930, ebh43333, ebh53550, ebh43491 
285        Ile   Val            ebh14930, ebh43333, ebh53550, ebh43491 
69         Ser   Asn            ebh43333 
256        Val   Ala            ebh43333, ebh53550, ebh43491 
93         Ala   Ser            ebh53550 
326        Asp   Glu            ebh53550, ebh43491 
333        Thr   Ala            ebh53550, ebh43491 
133        Leu   Met            ebh43491 
398        His   Arg            ebh39444 



Mutations were also introduced into the coding region of the ebh gene to 
identify mutants with improved yield, and/or rate of bioconversion and/or altered 
substrate specificity. Exemplary mutant nucleic acid sequences of the present 
invention are depicted in SEQ ID NO:30, 32, 34, 60, 62, 64, 66, 68, 70, 72 and 74. 

The nucleic acid sequence of SEQ ID NO:30 encodes a mutant ebh25-1 which 
exhibits altered substrate specificity. Plasmid pANT849ebh25-1 containing this 
mutant gene was deposited and accepted by an International Depository Authority 
under the provisions of the Budapest Treaty. The deposit was made on November 21, 
2002 to the American Type Culture Collection at 10801 University Boulevard in 
Manassas, Virginia 20110-2209. The ATCC Accession Number is PTA-4809. All 
restrictions upon public access to this plasmid will be irrevocably removed upon 
granting of this patent application. The Deposit will be maintained in a public 
depository for a period of thirty years after the date of deposit or five years after the 
last request for a sample or for the enforceable life of the patent, whichever is longer. 
The above-referenced plasmid was viable at the time of the deposit. The deposit will 
be replaced if viable samples cannot be dispensed by the depository. 

This S. lividans transformant identified in the screening of mutation 25 
(primers NPB29-mut25f (SEQ ID NO:58) and NPB29-mut25r (SEQ ID NO:59)) was 
found to produce a product with a different HPLC elution time than epothilone B or 
epothilone F. A sample of this unknown was analyzed by LC-MS and was found to 
have a molecular weight (M.W.) of 523, consistent with a single hydroxylation of 
epothilone B. Plasmid DNA was isolated from the S. lividans culture and used as a 
template for PCR amplification using primers NPB29-6f (SEQ ID NO:28) and 
NPB29-7r (SEQ ID NO:29) (see Example 17). The expected fragment was obtained 
and sequenced using the Big-Dye sequencing system. The ebh25-1 mutant was found 
to have two mutations resulting in changes in the amino acid sequence of the protein: 
asparagine 195 is changed to serine and serine 294 is changed to proline. The position 
targeted for mutation at codon 238 was found to have a two nucleotide change, which 
did not result in a change of the amino acid sequence of the protein. The amino acid 
sequence of the mutant polypeptide encoded by SEQ ID NO:30 is depicted in SEQ ID 
NO:31. 
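
The statement that a molecular weight of 523 is consistent with a single hydroxylation of epothilone B may be checked by simple arithmetic. The sketch below assumes the molecular formula C27H41NO6S for epothilone B and rounded standard atomic masses, neither of which is recited in the present specification; a single hydroxylation replaces a C-H with a C-OH and therefore adds one oxygen atom. The function name is hypothetical.

```python
import re

# Rounded standard atomic masses (assumed values, in g/mol).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def molecular_weight(formula):
    """Sum atomic masses for a simple formula such as 'C27H41NO6S'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total


EPOTHILONE_B = "C27H41NO6S"             # assumed formula of epothilone B
HYDROXY_EPOTHILONE_B = "C27H41NO7S"     # one additional oxygen atom

mw_parent = molecular_weight(EPOTHILONE_B)
mw_product = molecular_weight(HYDROXY_EPOTHILONE_B)
print(round(mw_parent, 1), round(mw_product, 1), round(mw_product - mw_parent, 1))
```

Because epothilone F is likewise a monohydroxylated epothilone B, mass alone does not distinguish the new product from epothilone F; the different HPLC elution time noted above is what indicates a different site of hydroxylation.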

The nucleic acid sequence of SEQ ID NO:32 encodes a mutant ebh10-53, 
which exhibits improved bioconversion yield. This S. lividans transformant identified 
in the screening of mutation 10 (primers NPB29-mut10f (SEQ ID NO:54) and 
NPB29-mut10r (SEQ ID NO:55)) produced a greater yield of epothilone F. Plasmid 
DNA was isolated from the S. lividans culture and used as a template for PCR 
amplification using primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID 
NO:29) (see Example 16). The expected fragment was obtained and sequenced using 
the Big-Dye sequencing system. The ebh10-53 mutant was found to have two 
mutations resulting in changes in the amino acid sequence of the protein: glutamic 
acid 231 is changed to arginine and phenylalanine 190 is changed to tyrosine. 
Position 231 was the target of the mutagenesis; the change at residue 190 is an 
inadvertent change that is an artifact of the mutagenesis procedure. The amino acid 
sequence of the mutant polypeptide encoded by SEQ ID NO:32 is depicted in SEQ ID 
NO:33. 
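
The amino acid changes reported for the mutants may be represented and applied programmatically, for example when preparing mutant sequences for modeling. The following is a minimal sketch in which each substitution is a (wild-type residue, 1-based position, replacement residue) triple and the wild-type residue is verified before it is replaced. The function name is hypothetical, and the 404-residue placeholder sequence is not SEQ ID NO:2; it merely carries the correct residues at positions 190 and 231 so that the ebh10-53 changes can be demonstrated.

```python
def apply_substitutions(sequence, substitutions):
    """Apply point substitutions given as (wild_type, position, replacement).

    Positions are 1-based. A ValueError is raised if the residue found at a
    position does not match the stated wild-type residue.
    """
    residues = list(sequence)
    for wild_type, position, replacement in substitutions:
        index = position - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


# Placeholder sequence (hypothetical): F at position 190, E at position 231.
placeholder = "A" * 189 + "F" + "A" * 40 + "E" + "A" * 173

# ebh10-53 as described above: Glu231 -> Arg and Phe190 -> Tyr.
ebh10_53 = apply_substitutions(placeholder, [("E", 231, "R"), ("F", 190, "Y")])
print(ebh10_53[189], ebh10_53[230], len(ebh10_53))
```

Checking the wild-type residue guards against off-by-one numbering errors, which are easy to introduce when positions are transcribed from a table.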

The nucleic acid sequence of SEQ ID NO:34 encodes a mutant ebh24-16, 
which also exhibits improved bioconversion yield. This S. lividans transformant, 
ebh24-16, identified in the screening of mutation 24 (primers NPB29-mut24f (SEQ ID 
NO:56) and NPB29-mut24r (SEQ ID NO:57)), also produced a greater yield of 
epothilone F. Plasmid DNA was isolated from the S. lividans culture and used as a 
template for PCR amplification using primers NPB29-6f (SEQ ID NO:28) and 
NPB29-7r (SEQ ID NO:29). The expected fragment was obtained and sequenced 
using the Big-Dye sequencing system. The ebh24-16 mutant was found to have two 
mutations resulting in changes in the amino acid sequence of the protein: 
phenylalanine 237 is changed to alanine and isoleucine 92 is changed to valine. 
Position 237 was the target of the mutagenesis; the change at residue 92 is an 
inadvertent change that is an artifact of the mutagenesis procedure. The amino acid 
sequence of the mutant polypeptide encoded by SEQ ID NO:34 is depicted in SEQ ID 
NO:35. 

The nucleic acid sequence of SEQ ID NO:60 encodes a mutant, ebh24-16d8,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16d8, identified in the screening of mutation 59 (primer NPB29mut59 (SEQ ID
NO:70)), also produced a greater yield of epothilone F. Plasmid DNA was isolated
from the S. rimosus culture and used as a template for PCR amplification using
primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected
fragment was obtained and sequenced using the Big-Dye sequencing system. The
ebh24-16d8 mutant was found to have one mutation resulting in a change in the
amino acid sequence of the protein: arginine 67 is changed to glutamine. This change
is an artifact of the mutagenesis procedure. The amino acid sequence of the mutant
polypeptide encoded by SEQ ID NO:60 is SEQ ID NO:61.

The nucleic acid sequence of SEQ ID NO:62 encodes a mutant, ebh24-16c11,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16c11, identified in the screening of mutation 59 (primer NPB29mut59 (SEQ
ID NO:70)), also produced a greater yield of epothilone F. Plasmid DNA was isolated
from the S. rimosus culture and used as a template for PCR amplification using
primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected
fragment was obtained and sequenced using the Big-Dye sequencing system. The
ebh24-16c11 mutant was found to have two additional mutations resulting in changes
in the amino acid sequence of the protein: alanine 93 is changed to glycine and
isoleucine 365 is changed to threonine. Position 93 is the target of the
mutagenesis; the change at 365 is an artifact of the mutagenesis procedure. The
amino acid sequence of the mutant polypeptide encoded by SEQ ID NO:62 is
depicted in SEQ ID NO:63.

The nucleic acid sequence of SEQ ID NO:64 encodes a mutant, ebh24-16-16,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16-16, identified in the screening of random mutants of ebh24-16, also
produced a greater yield of epothilone F. Plasmid DNA was isolated from the S.
rimosus culture and used as a template for PCR amplification using primers
NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected fragment was
obtained and sequenced using the Big-Dye sequencing system. The ebh24-16-16
mutant was found to have one additional mutation resulting in a change in the amino
acid sequence of the protein: valine 106 is changed to alanine. The amino acid
sequence of the mutant polypeptide encoded by SEQ ID NO:64 is depicted in SEQ ID
NO:65.

The nucleic acid sequence of SEQ ID NO:66 encodes a mutant, ebh24-16-74,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16-74, identified in the screening of random mutants of ebh24-16, also
produced a greater yield of epothilone F. Plasmid DNA was isolated from the S.
rimosus culture and used as a template for PCR amplification using primers
NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected fragment was
obtained and sequenced using the Big-Dye sequencing system. The ebh24-16-74
mutant was found to have one additional mutation resulting in a change in the amino
acid sequence of the protein: arginine 88 is changed to histidine. The amino acid
sequence of the mutant polypeptide encoded by SEQ ID NO:66 is SEQ ID NO:67.

The nucleic acid sequence of SEQ ID NO:68 encodes a mutant, ebh-M18,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh-M18, identified in the screening of random mutants of ebh, also produced a
greater yield of epothilone F. Plasmid DNA was isolated from the S. rimosus culture
and used as a template for PCR amplification using primers NPB29-6f (SEQ ID
NO:28) and NPB29-7r (SEQ ID NO:29). The expected fragment was obtained and
sequenced using the Big-Dye sequencing system. The ebh-M18 mutant was found to
have two mutations resulting in changes in the amino acid sequence of the protein:
glutamic acid 31 is changed to lysine and methionine 176 is changed to valine. The
amino acid sequence of the mutant polypeptide encoded by SEQ ID NO:68 is
depicted in SEQ ID NO:69.

The nucleic acid sequence of SEQ ID NO:72 encodes a mutant, ebh24-16g8,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16g8, identified in the screening of mutation 50 (primer NPB29mut50 (SEQ ID
NO:71)), also produced a greater yield of epothilone F. Plasmid DNA was isolated
from the S. rimosus culture and used as a template for PCR amplification using
primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected
fragment was obtained and sequenced using the Big-Dye sequencing system. The
ebh24-16g8 mutant was found to have two additional mutations resulting in changes
in the amino acid sequence of the protein: methionine 176 is changed to alanine and
isoleucine 130 is changed to threonine. Position 176 is the target of the
mutagenesis; the change at 130 is an artifact of the mutagenesis procedure. The
amino acid sequence of the mutant polypeptide encoded by SEQ ID NO:72 is
depicted in SEQ ID NO:73.

The nucleic acid sequence of SEQ ID NO:74 encodes a mutant, ebh24-16b9,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16b9, identified in the screening of mutation 50 (primer NPB29mut50 (SEQ ID
NO:71)), also produced a greater yield of epothilone F. Plasmid DNA was isolated
from the S. rimosus culture and used as a template for PCR amplification using
primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected
fragment was obtained and sequenced using the Big-Dye sequencing system. The
ebh24-16b9 mutant was found to have two additional mutations resulting in changes
in the amino acid sequence of the protein: methionine 176 is changed to serine and
alanine 140 is changed to threonine. Position 176 is the target of the mutagenesis;
the change at 140 is an artifact of the mutagenesis procedure. The amino acid
sequence of the mutant polypeptide encoded by SEQ ID NO:74 is depicted in SEQ ID
NO:75.

A mixture composed of the plasmids pANT849ebh-24-16, pANT849ebh-10-53,
pANT849ebh-24-16d8, pANT849ebh-24-16c11, pANT849ebh-24-16-16,
pANT849ebh-24-16-74, pANT849ebh-24-16b9, pANT849ebh-M18 and
pANT849ebh-24-16g8 for these nine mutant genes was deposited and accepted by an
International Depository Authority under the provisions of the Budapest Treaty. The
deposit was made on November 21, 2002 to the American Type Culture Collection at
10801 University Boulevard in Manassas, Virginia 20110-2209. The ATCC Accession
Number is PTA-4808. All restrictions upon public access to this mixture of plasmids
will be irrevocably removed upon granting of this patent application. The deposit will
be maintained in a public depository for a period of thirty years after the date of
deposit or five years after the last request for a sample or for the enforceable life of
the patent, whichever is longer. The above-referenced mixture of plasmids was viable
at the time of the deposit. The deposit will be replaced if viable samples cannot be
dispensed by the depository.

Thus, in accordance with another aspect of the present invention, there are
provided isolated polypeptides of epothilone B hydroxylase and variants and mutants
thereof and isolated polypeptides of ferredoxin or variants thereof. In one
embodiment of the present invention, by "polypeptide" it is meant to include the
amino acid sequence of SEQ ID NO:2, and fragments or variants, which retain
essentially the same biological activity and/or function as this epothilone B
hydroxylase. In another embodiment of the present invention, by "polypeptide" it is
meant to include the amino acid sequence of SEQ ID NO:4, and fragments and/or
variants, which retain essentially the same biological activity and/or function as this
ferredoxin.

By "variants" as used herein it is meant to include polypeptides with amino
acid sequences with conservative amino acid substitutions as compared to SEQ ID
NO:2 or 4 which are demonstrated to exhibit similar biological activity and/or
function to SEQ ID NO:2 or 4. By "conservative amino acid substitutions" it is
meant to include replacement, one for another, of the aliphatic amino acids such as
Ala, Val, Leu and Ile, the hydroxyl residues Ser and Thr, the acidic residues Asp and
Glu, and the amide residues Asn and Gln. Exemplary variant amino acid sequences
of the present invention are depicted in SEQ ID NO:43-49 and the amino acid
substitutions of these exemplary variants are described in Table 2, supra.
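
As an illustration of the definition above, the following Python sketch tests whether a single replacement falls within one of the four residue groups named in the text. It is a minimal sketch only: the function name is_conservative is hypothetical, and the groups are limited to those listed above even though the definition is stated as non-exhaustive ("meant to include").

```python
# Residue groups exactly as named in the text; the definition is
# non-exhaustive, so a False result here is not a statement about the claims.
CONSERVATIVE_GROUPS = (
    frozenset({"ALA", "VAL", "LEU", "ILE"}),  # aliphatic
    frozenset({"SER", "THR"}),                # hydroxyl
    frozenset({"ASP", "GLU"}),                # acidic
    frozenset({"ASN", "GLN"}),                # amide
)


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both three-letter residue codes share a listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues: no substitution took place
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

Under these assumptions, isoleucine to valine is reported as conservative, whereas glutamic acid to arginine is not.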

By "mutants" as used herein it is meant to include polypeptides encoded by
nucleic acid sequences with one or more point mutations, or deletions or additions of
nucleic acids as compared to SEQ ID NO:1 or 3, but which still have similar activity
to the polypeptides encoded by SEQ ID NO:1 or 3. In a preferred embodiment,
mutations are made to the nucleic acid that alter the substrate specificity and/or yield
from the polypeptide encoded thereby. A preferred region of mutation with respect to
the epothilone B hydroxylase gene is that region of the nucleic acid sequence coding
for the approximately 113 amino acid residues comprising the active site of the
enzyme. Also preferred are mutants with at least one amino acid substitution at
amino acid position GLU31, ARG67, ARG88, ILE92, ALA93, VAL106, ILE130,
ALA140, MET176, PHE190, GLU231, SER294, PHE237, or ILE365 of SEQ ID
NO:1. Exemplary mutants ebh25-1, ebh10-53, ebh24-16, ebh24-16d8, ebh24-16c11,
ebh24-16-16, ebh24-16-74, ebh24-16g8, ebh24-16b9 and the nucleic acid sequences
encoding such mutants of the present invention are depicted in SEQ ID NO:31, 33,
35, 61, 63, 65, 67, 69, 71, 73 and 75, and SEQ ID NO:30, 32, 34, 60, 62, 64, 66, 68,
70, 72 and 74, respectively.
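
Amino acid substitutions of the kind reported for the mutants above are often written in a wild-type/position/replacement shorthand, for example E231R for "glutamic acid 231 is changed to arginine". The Python sketch below applies such substitutions to a sequence and checks the wild-type residue at each position. It is illustrative only: the shorthand, the names STATED_SUBSTITUTIONS and apply_substitutions, and the three-residue toy sequence in the usage note are assumptions introduced here and are not taken from the sequence listing.

```python
import re

# Substitutions stated in the text for two of the mutants, in one-letter
# shorthand (an assumed notation; the source spells the changes out in words).
STATED_SUBSTITUTIONS = {
    "ebh10-53": ["E231R", "F190Y"],
    "ebh24-16": ["F237A", "I92V"],
}

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, substitutions: list) -> str:
    """Apply substitutions such as 'E231R' (1-based positions) to a sequence."""
    residues = list(sequence)
    for sub in substitutions:
        match = _SUBSTITUTION.match(sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{sub}: found {residues[position - 1]} at {position}")
        residues[position - 1] = new
    return "".join(residues)
```

For a toy sequence "MEF", apply_substitutions("MEF", ["E2R", "F3Y"]) gives "MRY"; applying an entry of STATED_SUBSTITUTIONS would require the full-length sequence of SEQ ID NO:2.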

A 3-dimensional model of epothilone B hydroxylase has also been constructed
in accordance with general teachings of Greer et al. (Comparative modeling of
homologous proteins. Methods in Enzymology 202:239-52, 1991), Lesk et al.
(Homology Modeling: Inferences from Tables of Aligned Sequences. Curr. Op.
Struc. Biol. (2) 242-247, 1992), and Cardozo et al. (Homology modeling by the ICM
method. Proteins 23, 403-14, 1995) on the basis of the known structure of a
homologous protein EryF (PDB Code 1KIN chain A). Homology between these
sequences is 34%. Alignment of the sequences of epothilone B hydroxylase (SEQ ID
NO:2) and EryF (PDB Code 1KIN chain A; SEQ ID NO:76) is depicted in Figure 3.
A homology model of epothilone B hydroxylase based upon sequence alignment with
EryF is depicted in Figure 4.

An energy plot of the epothilone B hydroxylase model relative to EryF (PDB
code 1JIN) was also prepared and is depicted in Figure 5. An averaging window size
of 51 residues was used: at a given residue position, the average was calculated over
the energies of the 51 residues in the sequence that lie with the given residue at the
central position. As shown in Figure 5, all energies along the sequence lie below zero,
thus indicating that the modeled structure as set forth in Figure 4 and Appendix 1 is
reasonable.
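
The windowed averaging described above can be sketched in Python as follows. This is a hedged illustration, not the procedure actually used to prepare Figure 5: the function names are hypothetical, and the handling of positions closer than 25 residues to either end of the sequence (the window is simply truncated) is an assumption, since the text does not say how the ends were treated.

```python
def windowed_mean(energies, window=51):
    """Average per-residue energies over a window centred on each residue.

    Near the sequence ends the window is truncated (an assumption).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    averaged = []
    for i in range(len(energies)):
        lo = max(0, i - half)
        hi = min(len(energies), i + half + 1)
        chunk = energies[lo:hi]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


def all_below_zero(energies, window=51):
    """True if every windowed average is negative (the criterion in the text)."""
    return all(value < 0 for value in windowed_mean(energies, window))
```

With a window of 3, the input [1, 2, 3, 4, 5] yields [1.5, 2.0, 3.0, 4.0, 4.5], which shows the truncation at both ends.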

The three-dimensional structure represented in the homology model of
epothilone B hydroxylase of Figure 4 is defined by a set of structure coordinates as set
forth in Appendix 1. The term "structure coordinates" refers to Cartesian coordinates
generated from the building of a homology model. As will be understood by those of
skill in the art, however, a set of structure coordinates for a protein is a relative set of
points that define a shape in three dimensions. Thus, it is possible that an entirely
different set of coordinates could define a similar or identical shape. Moreover, slight
variations in the individual coordinates, as emanate from generation of similar
homology models using different alignment templates and/or using different methods
in generating the homology model, will have minor effects on the overall shape.
Variations in coordinates may also be generated because of mathematical
manipulations of the structure coordinates. For example, the structure coordinates set
forth in Appendix 1 could be manipulated by fractionalization of the structure
coordinates; integer additions or subtractions to sets of the structure coordinates,
inversion of the structure coordinates or any combination of the above.

Various computational analyses are therefore necessary to determine whether
a molecule or a portion thereof is sufficiently similar to all or parts of epothilone B
hydroxylase described above as to be considered the same. Such analyses may be
carried out in current software applications, such as SYBYL version 6.7 or
INSIGHTII (Molecular Simulations Inc., San Diego, CA) version 2000 and as
described in the accompanying User's Guides.

For example, the superimposition tool in the program SYBYL allows
comparisons to be made between different structures and different conformations of
the same structure. The procedure used in SYBYL to compare structures is divided
into four steps: 1) load the structures to be compared; 2) define the atom equivalencies
in these structures; 3) perform a fitting operation; and 4) analyze the results. Each
structure is identified by a name. One structure is identified as the target (i.e., the
fixed structure); the second structure (i.e., moving structure) is identified as the source
structure. Since atom equivalency within SYBYL is defined by user input, for the
purpose of this aspect of the present invention equivalent atoms are defined as protein
backbone atoms (N, Cα, C and O) for all conserved residues between the two
structures being compared. Further, only rigid fitting operations are considered.
When a rigid fitting method is used, the working structure is translated and rotated to
obtain an optimum fit with the target structure. The fitting operation uses an algorithm
that computes the optimum translation and rotation to be applied to the moving
structure, such that the root mean square difference of the fit over the specified pairs
of equivalent atoms is an absolute minimum. This number, given in angstroms, is
reported by SYBYL.
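
A rigid fit of the kind described above can be sketched outside SYBYL with the Kabsch algorithm, a standard least-squares superposition method. The Python sketch below is an assumption-laden illustration: it is not SYBYL's implementation, the function name rigid_fit_rmsd is hypothetical, it relies on the NumPy library, and it presumes that the two inputs already list the equivalent backbone atoms (N, Cα, C, O of conserved residues) in the same order.

```python
import numpy as np


def rigid_fit_rmsd(moving, target):
    """Superimpose `moving` on `target` (N x 3 arrays of equivalent atoms)
    by the optimal translation and proper rotation; return the RMSD in the
    units of the input coordinates (angstroms for PDB-style data)."""
    P = np.asarray(moving, dtype=float)
    Q = np.asarray(target, dtype=float)
    if P.ndim != 2 or P.shape != Q.shape or P.shape[1] != 3:
        raise ValueError("expected two N x 3 coordinate arrays of equal shape")
    # Translation: bring both centroids to the origin.
    P0 = P - P.mean(axis=0)
    Q0 = Q - Q.mean(axis=0)
    # Rotation: Kabsch algorithm via SVD of the 3 x 3 covariance matrix.
    U, _, Vt = np.linalg.svd(P0.T @ Q0)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    fitted = P0 @ R.T
    return float(np.sqrt(((fitted - Q0) ** 2).sum() / len(P)))
```

Because only translation and proper rotation are allowed, a rotated and shifted copy of a structure fits with an RMSD of essentially zero, whereas its mirror image does not.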

For the purposes of the present invention, any homology model of epothilone
B hydroxylase that has a root mean square deviation of conserved residue backbone
atoms (N, Cα, C, O) of less than about 4.0 Å when superimposed on the
corresponding backbone atoms described by structure coordinates listed in Appendix
1 is considered identical. More preferably, the root mean square deviation is less
than about 3.0 Å. More preferably the root mean square deviation is less than about
2.0 Å.

For the purpose of this invention, any homology model of epothilone B
hydroxylase that has a root mean square deviation of conserved residue backbone
atoms (N, Cα, C, O) of less than about 2.0 Å when superimposed on the
corresponding backbone atoms described by structure coordinates listed in Appendix
1 is considered identical. More preferably, the root mean square deviation is less
than about 1.0 Å.

In another embodiment of the present invention, structural models wherein
backbone atoms have been substituted with other elements which when superimposed
on the corresponding backbone atoms have low root mean square deviations are
considered to be identical. For example, a homology model where the original
backbone carbon, and/or nitrogen and/or oxygen atoms are replaced with other
elements having a root mean square deviation of about 4.0 Å, more preferably about
3.0 Å, even more preferably less than about 2.0 Å, when superimposed on the
corresponding backbone atoms described by structure coordinates listed in Appendix
1 is considered identical.

The term "root mean square deviation" means the square root of the arithmetic
mean of the squares of the deviations from the mean. It is a way to express the
deviation or variation from a trend or object. For purposes of this invention, the "root
mean square deviation" defines the variation in the backbone of a protein from the
relevant portion of the backbone of the epothilone B hydroxylase portion of the
complex as defined by the structure coordinates described herein.
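
In the structural-comparison sense used here, the root mean square deviation is conventionally written as below. The symbols are introduced for illustration only and are not taken from the source: N is the number of equivalent backbone atoms, x_i is the position of atom i in the superimposed model, and y_i is the corresponding position from the reference coordinates.

```latex
\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \lVert \mathbf{x}_i - \mathbf{y}_i \rVert^{2}}
% Worked example with three atoms displaced by 1.0, 2.0 and 2.0 angstroms:
% \sqrt{(1.0^2 + 2.0^2 + 2.0^2)/3} = \sqrt{3} \approx 1.73 angstroms
```

In this hypothetical three-atom example the value of about 1.73 Å would fall below a 2.0 Å threshold but above a 1.0 Å threshold.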

The present invention as embodied by the homology model enables the
structure-based design of additional mutants of epothilone B hydroxylase. For
example, using the homology model of the present invention, residues lying within
10 Å of the binding site of epothilone B hydroxylase have now been defined. These
residues include LEU39, GLN43, ALA45, MET57, LEU58, HIS62, PHE63, SER64,
SER65, ASP66, ARG67, GLN68, SER69, LEU74, MET75, VAL76, ALA77,
ARG78, GLN79, ILE80, ASP84, LYS85, PRO86, PHE87, ARG88, PRO89, SER90,
LEU91, ILE92, ALA93, MET94, ASP95, HIS99, ARG103, PHE110, ILE155,
PHE169, GLN170, CYS172, SER173, SER174, ARG175, MET176, LEU177,
SER178, ARG179, ARG186, PHE190, LEU193, VAL233, GLY234, LEU235,
ALA236, PHE237, LEU238, LEU239, LEU240, ILE241, ALA242, GLY243,
HIS244, GLU245, THR246, THR247, ALA248, ASN249, MET250, LEU283,
THR287, ILE288, ALA289, GLU290, THR291, ALA292, THR293, SER294,
ARG295, PHE296, ALA297, THR298, GLU312, GLY313, VAL314, VAL315,
GLY316, VAL344, ALA345, PHE346, GLY347, PHE348, VAL350, HIS351,
GLN352, CYS353, LEU354, GLY355, GLN356, LEU358, ALA359, GLU362,
LYS389, ASP391, SER392, THR393, ILE394 and TYR395 as set forth in Appendix
1. Mutants with mutations at one or more of these positions are expected to exhibit
altered biological function and/or specificity and thus comprise another embodiment
of preferred mutants of the present invention. Another embodiment of preferred
mutants are molecules that have a root mean square deviation from the backbone
atoms of said epothilone B hydroxylase of not more than about 4.0 Å.
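
A distance-based selection of this kind can be sketched in Python as follows. The sketch is hypothetical throughout: the text does not state which points define the binding site or how Appendix 1 is formatted, so the function name residues_within, the atom tuples and the site points below are assumptions chosen only to show the 10 Å cutoff test.

```python
from math import dist


def residues_within(atoms, site_points, cutoff=10.0):
    """Return labels such as 'ARG67' for residues with any atom within
    `cutoff` angstroms of any point that defines the binding site.

    atoms: iterable of (residue_name, residue_number, (x, y, z))
    site_points: iterable of (x, y, z) tuples (assumed site definition)
    """
    site = list(site_points)
    hits = set()
    for res_name, res_num, xyz in atoms:
        if any(dist(xyz, point) <= cutoff for point in site):
            hits.add((res_num, res_name))
    # Sort by residue number so the output follows sequence order.
    return [f"{name}{num}" for num, name in sorted(hits)]
```

For example, with a single site point at the origin, an atom of ARG67 at (1, 0, 0) and an atom of LEU39 at (0, 9.5, 0) are selected, while an atom of ILE365 at (30, 0, 0) is not; these coordinates are invented for the illustration.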

The structure coordinates of an epothilone B hydroxylase homology model or
portions thereof are stored in a machine-readable storage medium. Such data may be
used for a variety of purposes, such as drug discovery.

Accordingly, another aspect of the present invention relates to a
machine-readable data storage medium comprising a data storage material encoded
with the structure coordinates set forth in Appendix 1.

The three-dimensional model structure of epothilone B hydroxylase can also
be used to identify modulators of biological function and potential substrates of the
enzyme. Various methods or combinations thereof can be used to identify such
modulators.

For example, a test compound can be modeled that fits spatially into a binding
site in epothilone B hydroxylase, according to Appendix 1. Structure coordinates of
amino acids within 10 Å of the binding region of epothilone B hydroxylase defined by
amino acids LEU39, GLN43, ALA45, MET57, LEU58, HIS62, PHE63, SER64,
SER65, ASP66, ARG67, GLN68, SER69, LEU74, MET75, VAL76, ALA77,
ARG78, GLN79, ILE80, ASP84, LYS85, PRO86, PHE87, ARG88, PRO89, SER90,
LEU91, ILE92, ALA93, MET94, ASP95, HIS99, ARG103, PHE110, ILE155,
PHE169, GLN170, CYS172, SER173, SER174, ARG175, MET176, LEU177,
SER178, ARG179, ARG186, PHE190, LEU193, VAL233, GLY234, LEU235,
ALA236, PHE237, LEU238, LEU239, LEU240, ILE241, ALA242, GLY243,
HIS244, GLU245, THR246, THR247, ALA248, ASN249, MET250, LEU283,
THR287, ILE288, ALA289, GLU290, THR291, ALA292, THR293, SER294,
ARG295, PHE296, ALA297, THR298, GLU312, GLY313, VAL314, VAL315,
GLY316, VAL344, ALA345, PHE346, GLY347, PHE348, VAL350, HIS351,
GLN352, CYS353, LEU354, GLY355, GLN356, LEU358, ALA359, GLU362,
LYS389, ASP391, SER392, THR393, ILE394 and TYR395, and the coordinated
heme group, HEM1, can also be used to identify desirable structural and chemical
features of such modulators. Identified structural or chemical features can then be
employed to design or select compounds as potential epothilone B hydroxylase
ligands. By structural and chemical features it is meant to include, but is not limited
to, covalent bonding, van der Waals interactions, hydrogen bonding interactions,
charge interaction, hydrophobic bonding interaction, and dipole interaction.
Compounds identified as potential epothilone B hydroxylase ligands can then be
synthesized and screened in an assay characterized by binding of a test compound to
epothilone B hydroxylase, or in characterizing the ability of epothilone B hydroxylase
to modulate a protease target in the presence of a small molecule. Examples of assays
useful in screening of potential epothilone B hydroxylase ligands include, but are not
limited to, screening in silico, in vitro assays and high throughput assays.

As will be understood by those of skill in the art upon this disclosure, other
structure-based design methods can be used. Various computational structure-based
design methods have been disclosed in the art. For example, a number of computer
modeling systems are available in which the sequence of epothilone B hydroxylase
and the epothilone B hydroxylase structure (i.e., atomic coordinates of epothilone B
hydroxylase as provided in Appendix 1 and/or the atomic coordinates within 10 Å of
the binding region as provided above) can be input. This computer system then
generates the structural details of one or more of these regions in which a potential
epothilone B hydroxylase modulator binds so that complementary structural details of
the potential modulators can be determined. Design in these modeling systems is
generally based upon the compound being capable of physically and structurally
associating with epothilone B hydroxylase. In addition, the compound must be able
to assume a conformation that allows it to associate with epothilone B hydroxylase.
Some modeling systems estimate the potential inhibitory or binding effect of a
potential epothilone B hydroxylase substrate or modulator prior to actual synthesis
and testing.

Methods for screening chemical entities or fragments for their ability to
associate with a given protein target are also well known. Often these methods begin
by visual inspection of the binding site on the computer screen. Selected fragments or
chemical entities are then positioned in a binding region of epothilone B hydroxylase.
Docking is accomplished using software such as INSIGHTII, QUANTA and SYBYL,
followed by energy minimization and molecular dynamics with standard molecular
mechanics force fields such as MMFF, CHARMM and AMBER. Examples of
computer programs which assist in the selection of chemical fragments or chemical
entities useful in the present invention include, but are not limited to, GRID
(Goodford, 1985), AUTODOCK (Goodsell, 1990), and DOCK (Kuntz et al. 1982).

Upon selection of preferred chemical entities or fragments, their relationship
to each other and epothilone B hydroxylase can be visualized and then assembled into
a single potential modulator. Programs useful in assembling the individual chemical
entities include, but are not limited to, CAVEAT (Bartlett et al. 1989) and 3D
Database systems (Martin 1992).

Alternatively, compounds may be designed de novo using either an empty
active site or optionally including some portion of a known inhibitor. Methods of this
type of design include, but are not limited to, LUDI (Bohm 1992) and LeapFrog
(Tripos Inc., St. Louis MO).

Programs such as DOCK (Kuntz et al. 1982) can be used with the atomic
coordinates from the homology model to identify potential ligands from databases or
virtual databases which potentially bind in the active site binding region and which
may therefore be suitable candidates for synthesis and testing.

Also provided in the present invention are vectors comprising polynucleotides
of the present invention and host cells which are genetically engineered with vectors
of the present invention to produce epothilone B hydroxylase or active fragments and
variants or mutants of this enzyme and/or ferredoxin or active fragments thereof.
Generally, any vector suitable to maintain, propagate or express polynucleotides to
produce these polypeptides in the host cell may be used for expression in this regard.
In accordance with this aspect of the invention the vector may be, for example, a
plasmid vector, a single- or double-stranded phage vector, or a single- or
double-stranded RNA or DNA viral vector. Vectors may be extra-chromosomal or
designed for integration into the host chromosome. Such vectors include, but are not
limited to, chromosomal, episomal and virus-derived vectors, e.g., vectors derived
from bacterial plasmids, bacteriophages, yeast episomes, yeast chromosomal
elements, and viruses such as baculoviruses, papova viruses, SV40, vaccinia viruses,
adenoviruses, fowl pox viruses, pseudorabies viruses and retroviruses, and vectors
derived from combinations thereof, such as those derived from plasmid and
bacteriophage genetic elements, cosmids and phagemids.

Useful expression vectors for prokaryotic hosts include, but are not limited to,
bacterial plasmids, such as those from E. coli, Bacillus or Streptomyces, including
pBluescript, pGEX-2T, pUC vectors, pET vectors, ColE1, pCR1, pBR322, pMB9,
pCW, pBMS200, pBMS2020, pIJ101, pIJ702, pANT849, pOJ260, pOJ446,
pSET152, pKC1139, pKC1218, pFD666 and their derivatives, wider host range
plasmids, such as RP4, phage DNAs, e.g., the numerous derivatives of phage lambda,
e.g., NM989, λGT10 and λGT11, and other phages, e.g., M13 and filamentous single
stranded phage DNA.

Vectors of the present invention for use in yeast will typically contain an
origin of replication suitable for use in yeast and a selectable marker that is functional
in yeast. Examples of yeast vectors useful in the present invention include, but are not
limited to, Yeast Integrating plasmids (e.g., YIp5) and Yeast Replicating plasmids
(the YRp and YEp series plasmids), Yeast Centromere plasmids (the YCp series
plasmids), Yeast Artificial Chromosomes (YACs) which are based on yeast linear
plasmids, denoted YLp, pGPD-2, 2μ plasmids and derivatives thereof, and improved
shuttle vectors such as those described in Gietz et al., Gene, 74: 527-34 (1988)
(YIplac, YEplac and YCplac).

Mammalian vectors useful for recombinant expression may include a viral
origin, such as the SV40 origin (for replication in cell lines expressing the large
T-antigen, such as COS1 and COS7 cells), the papillomavirus origin, or the EBV
origin for long term episomal replication (for use, e.g., in 293-EBNA cells, which
constitutively express the EBV EBNA-1 gene product and adenovirus E1A).
Expression in mammalian cells can be achieved using a variety of plasmids,
including, but not limited to, pSV2, pBC12BI, and p91023, pCDNA vectors as well as
lytic virus vectors (e.g., vaccinia virus, adenovirus, and baculovirus), episomal virus
vectors (e.g., bovine papillomavirus), and retroviral vectors (e.g., murine
retroviruses). Useful vectors for insect cells include baculoviral vectors and pVL941.

Selection of an appropriate promoter to direct mRNA transcription and
construction of expression vectors are well known. In general, however, expression
constructs will contain sites for transcription initiation and termination, and, in the
transcribed region, a ribosome binding site for translation. The coding portion of the
mature transcripts expressed by the constructs will include a translation initiating
codon at the beginning and a termination codon appropriately positioned at the end of
the polypeptide to be translated.

Examples of useful promoters for prokaryotes include, but are not limited to,
phage promoters such as phage lambda pL promoter, the trc promoter, a hybrid
derived from the trp and lac promoters, the bacteriophage T7 promoter, the TAC or
TRC system, the major operator and promoter regions of phage lambda, the control
regions of fd coat protein, snpA promoter, melC promoter, ermE* promoter or the
araBAD operon. Examples of useful promoters for yeast include, but are not limited
to, the CYC1 promoter, the GAL1 promoter, the GAL10 promoter, ADH1 promoter,
the promoters of the yeast α-mating system, and the GPD promoter. Examples of
promoters routinely used in mammalian expression vectors include, but are not
limited to, the CMV immediate early promoter, the HSV thymidine kinase promoter,
the early and late SV40 promoters, the promoters of retroviral LTRs, such as those of
the Rous Sarcoma Virus (RSV), and metallothionein promoters, such as the mouse
metallothionein-I promoter.

Vectors comprising the polynucleotides can be introduced into host cells using
any number of well known techniques including infection, transduction, transfection,
transvection and transformation. The polynucleotides may be introduced into a host
alone or with additional polynucleotides encoding, for example, a selectable marker
or ferredoxin reductase. In a preferred embodiment of the present invention the
polynucleotides for epothilone B hydroxylase and ferredoxin are introduced into the
host cell. Host cells for the various expression constructs are well known, and those
of skill can routinely select a host cell for expressing the epothilone B hydroxylase
and/or ferredoxin in accordance with this aspect of the present invention. Examples
of mammalian expression systems useful in the present invention include, but are not
limited to, the C127, 3T3, CHO, HeLa, human kidney 293 and BHK cell lines, and
the COS-7 line of monkey kidney fibroblasts.

Alternatively, as exemplified herein, epothilone B hydroxylase and ferredoxin 
can be expressed recombinantly in microorganisms. 

Accordingly, another aspect of the present invention relates to recombinantly
produced microorganisms which express epothilone B hydroxylase alone or in
conjunction with the ferredoxin and which are capable of hydroxylating a compound,
and in particular an epothilone, having a terminal alkyl group to produce ones having
a terminal hydroxyalkyl group. The recombinantly produced microorganisms are
produced by transforming cells such as bacterial cells with a plasmid comprising a
nucleic acid sequence encoding epothilone B hydroxylase. In a preferred
embodiment, the cells are transformed with a plasmid comprising a nucleic acid
encoding epothilone B hydroxylase or mutants or variants thereof as well as the
nucleic acid sequence encoding ferredoxin located downstream of the epothilone B
hydroxylase gene. Examples of microorganisms which can be transformed with these
plasmids to produce the recombinant microorganisms of the present invention
include, but are not limited to, Escherichia coli, Bacillus megaterium, Amycolatopsis
orientalis, Sorangium cellulosum, Rhodococcus erythropolis, and Streptomyces
species such as Streptomyces lividans, Streptomyces virginiae, Streptomyces
venezuelae, Streptomyces albus, Streptomyces coelicolor, Streptomyces rimosus and
Streptomyces griseus.

The recombinantly produced microorganisms of the present invention are
useful in microbial processes or methods for production of compounds, and in
particular epothilones, containing a terminal hydroxyalkyl group. In general, the
hydroxyalkyl-bearing product can be produced by culturing the recombinantly
produced microorganism or enzyme derived therefrom, capable of selectively
hydroxylating a terminal carbon or alkyl, in the presence of a suitable substrate in an
aqueous nutrient medium containing sources of assimilable carbon and nitrogen,
under submerged aerobic conditions.

Suitable epothilones employed as substrate for the method of the present
invention may be any such compound having a terminal carbon or terminal alkyl
group capable of undergoing the enzymatic hydroxylation of the present invention.
The starting material, or substrate, can be isolated from natural sources, such as
Sorangium cellulosum, or can be a synthetically formed epothilone. Other
substrates having a terminal carbon or terminal alkyl group capable of undergoing an
enzymatic hydroxylation can be employed by the methods herein. For example,
compactin can be used as a substrate, which upon hydroxylation forms the compound
pravastatin. Methods for hydroxylating compactin to pravastatin via an
Actinomadura strain are set forth in U.S. Patent 5,942,423 and U.S. Patent 6,274,360.

For example, using the recombinant microorganisms of the present invention,
at least one epothilone can be prepared as described in WO 00/39276, U.S. Serial
No. 09/468,854, filed December 21, 1999, the text of which is incorporated herein as
if set forth at length. An epothilone of the following Formula I

HO-CH2-(A1)n-(Q)m-(A2)o-E (I)

where

A1 and A2 are independently selected from the group of optionally substituted
C1-C3 alkyl and alkenyl;

Q is an optionally substituted ring system containing one to three rings and at
least one carbon to carbon double bond in at least one ring;

n, m, and o are integers selected from the group consisting of zero and 1,
where at least one of m or n or o is 1; and

E is an epothilone core;

can be prepared. This method comprises the steps of contacting at least one epothilone
of the following formula II

CH3-(A1)n-(Q)m-(A2)o-E (II)

where A1, Q, A2, E, n, m, and o are defined as above;
with a recombinantly produced microorganism, or an enzyme derived
therefrom, which is capable of selectively catalyzing the hydroxylation of formula II,
and effecting said hydroxylation.

In a preferred embodiment, the starting material is epothilone B. Epothilone B
can be obtained from the fermentation of Sorangium cellulosum So ce90, as described
in DE 41 38 042 and WO 93/10121. The strain has been deposited at the Deutsche
Sammlung von Mikroorganismen (German Collection of Microorganisms) (DSM)
under No. 6773. The process of fermentation is also described in Hofle, G., et al.,
Angew. Chem. Int. Ed. Engl., Vol. 35, No. 13/14, 1567-1569 (1996). Epothilone B can
also be obtained by chemical means, such as those disclosed by Meng, D., et al., J.
Am. Chem. Soc., Vol. 119, No. 42, 10073-10092 (1996); Nicolaou, K., et al., J. Am.
Chem. Soc., Vol. 119, No. 34, 7974-7991 (1997) and Schinzer, D., et al., Chem. Eur.
J., Vol. 5, No. 9, 2483-2491 (1999).

Growth of the recombinantly produced microorganism selected for use in the
process may be achieved by one of ordinary skill in the art by the use of appropriate
nutrient medium. Appropriate media for the growing of the recombinantly produced
microorganisms include those that provide nutrients necessary for the growth of
microbial cells. See, for example, T. Nagodawithana and J. M. Wasileski, Chapter 2:
"Media Design for Industrial Fermentations," Nutritional Requirements of
Commercially Important Microorganisms, edited by T. W. Nagodawithana and G.
Reed, Esteekay Associates, Inc., Milwaukee, WI, 18-45 (1998); T. L. Miller and B.
W. Churchill, Chapter 10: "Substrates for Large-Scale Fermentations," Manual of
Industrial Microbiology and Biotechnology, edited by A. L. Demain and N. A.
Solomon, American Society for Microbiology, Washington, D.C., 122-136 (1986). A
typical medium for growth includes necessary carbon sources, nitrogen sources, and
trace elements. Inducers may also be added to the medium. The term inducer, as used
herein, includes any compound enhancing formation of the desired enzymatic activity
within the recombinantly produced microbial cell. Typical inducers as used herein
may include solvents used to dissolve substrates, such as dimethyl sulfoxide, dimethyl
formamide, dioxane, ethanol and acetone. Further, some substrates, such as
epothilone B, may also be considered to be inducers.

Carbon sources may include sugars such as glucose, fructose, galactose,
maltose, sucrose, mannitol, sorbitol, glycerol, starch and the like; organic acids such as
sodium acetate, sodium citrate, and the like; and alcohols such as ethanol, propanol
and the like. Preferred carbon sources include, but are not limited to, glucose,
fructose, sucrose, glycerol and starch.

Nitrogen sources may include N-Z amine A, corn steep liquor, soybean
meal, beef extract, yeast extract, tryptone, peptone, cottonseed meal, peanut meal,
amino acids such as sodium glutamate and the like, sodium nitrate, ammonium sulfate
and the like.

Trace elements may include magnesium, manganese, calcium, cobalt, nickel,
iron, sodium and potassium salts. Phosphates may also be added in trace or,
preferably, greater than trace amounts.

The medium employed for the fermentation may include more than one
carbon or nitrogen source or other nutrient.

For growth of the recombinantly produced microorganisms and/or
hydroxylation according to the method of the present invention, the pH of the medium
is preferably from about 5 to about 8 and the temperature is from about 14°C to about
37°C, preferably the temperature is 28°C. The duration of the reaction is 1 to 100
hours, preferably 8 to 72 hours.

The medium is incubated for a period of time necessary to complete the
biotransformation as monitored by high performance liquid chromatography (HPLC).
Typically, the period of time needed to complete the transformation is twelve to one
hundred hours and preferably about 72 hours after the addition of the substrate. The
medium is placed on a rotary shaker (New Brunswick Scientific Innova 5000)
operating at 150 to 300 rpm and preferably about 250 rpm with a throw of 2 inches.
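
The operating ranges quoted above can be collected into a small lookup for checking a planned run. The following Python sketch is illustrative only: the names `Range`, `BROAD_RANGES` and `out_of_range` are hypothetical, and the numeric limits are simply the "about" values stated in the text, treated here as hard bounds for convenience.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed numeric interval [low, high]."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Broad ranges quoted in the text (the "about" qualifiers are ignored here).
BROAD_RANGES = {
    "ph": Range(5, 8),
    "temp_c": Range(14, 37),
    "duration_h": Range(1, 100),
    "shaker_rpm": Range(150, 300),
}

# Preferred values quoted in the text.
PREFERRED = {"temp_c": 28, "duration_h": Range(8, 72), "shaker_rpm": 250}


def out_of_range(conditions: dict) -> list:
    """Return the sorted names of conditions that fall outside the broad ranges."""
    return sorted(
        name
        for name, value in conditions.items()
        if name in BROAD_RANGES and not BROAD_RANGES[name].contains(value)
    )


if __name__ == "__main__":
    run = {"ph": 7.0, "temp_c": 28, "duration_h": 72, "shaker_rpm": 250}
    print(out_of_range(run))
```

A run at the preferred settings yields an empty list; any entry in the result names a parameter lying outside the ranges recited above.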

The hydroxyalkyl-bearing product can be recovered from the fermentation
broth by conventional means that are commonly used for the recovery of other known
biologically active substances. Examples of such recovery means include, but are not
limited to, isolation and purification by extraction with a conventional solvent, such as
ethyl acetate and the like; by pH adjustment; by treatment with a conventional resin,
for example, by treatment with an anion or cation exchange resin or a non-ionic
adsorption resin; by treatment with a conventional adsorbent, for example, by
distillation, by crystallization; or by recrystallization, and the like.

The extract obtained above from the biotransformation reaction mixture can be
further isolated and purified by column chromatography and analytical thin layer
chromatography.

The ability of a recombinantly produced microorganism of the present
invention to biotransform an epothilone having a terminal alkyl group to an
epothilone having a terminal hydroxyalkyl group was demonstrated. In these
experiments, a culture comprising a Streptomyces lividans clone containing a plasmid
with the ebh gene as described in more detail in Example 11 was incubated with an
epothilone B suspension for 3 days at 30°C with agitation. A sample of the incubate
was extracted with an equal volume of 25% methanol: 75% n-butanol, vortexed and
allowed to settle for 5 minutes. Two hundred µl of the organic phase was transferred
to an HPLC vial and analyzed by HPLC/MS (Example 12). A product peak of
epothilone F eluted at a retention time of 15.9 minutes and had a protonated molecular
weight of 524. The epothilone B substrate eluted at 19.0 minutes and had a
protonated molecular weight of 508. The peak retention times and molecular weights
were confirmed using known standards.

Rates of biotransformation of epothilone B by cells expressing ebh were also
compared to rates of biotransformation by ebh mutants. Cells expressing ebh
comprised a frozen spore preparation of S. lividans (pANT849-ebh). Cells
expressing mutants comprised frozen spore preparations of S. lividans (pANT849-
ebh10-53) and S. lividans (pANT849-ebh24-16). A frozen spore preparation of S.
lividans TK24 was used as the control. The cells were pre-incubated for several days
at 30°C. Following this pre-incubation, epothilone B in 100% EtOH was added to
each culture to a final concentration of 0.05% weight/volume. Samples were then
taken at 0, 24, 48 and 72 hours with the exception of the S. lividans (pANT849-
ebh24-16) culture, in which the epothilone B had been completely converted to
epothilone F at 48 hours. The samples were analyzed by HPLC. The results are
calculated as a percentage of the epothilone B at time 0 hours.








Epothilone B:

Time (hours)   TK24    pANT849-ebh   pANT849-ebh10-53   pANT849-ebh24-16
0              100%    100%          100%               100%
24             99%     78%           69%                56%
48             87%     19%           39%                0%
72             87%     0%            3%                 -




Epothilone F:

Time (hours)   TK24    pANT849-ebh   pANT849-ebh10-53   pANT849-ebh24-16
0              0%      0%            0%                 0%
24             0%      4%            9%                 23%
48             0%      21%           29%                52%
72             0%      14%           41%                -
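
The two tables above can be read together by tabulating, for each strain and time point, how much epothilone B was consumed, how much epothilone F was detected, and what fraction of the starting material the two compounds jointly account for. The Python sketch below does only that arithmetic on the tabulated percentages; the dictionary layout and the function name `mass_balance` are illustrative choices, and the missing 72 hour sample for the ebh24-16 culture is represented as `None`.

```python
TIMES_H = [0, 24, 48, 72]

# Percent of the time-0 epothilone B, copied from the tables above.
EPO_B = {
    "TK24": [100, 99, 87, 87],
    "pANT849-ebh": [100, 78, 19, 0],
    "pANT849-ebh10-53": [100, 69, 39, 3],
    "pANT849-ebh24-16": [100, 56, 0, None],  # no 72 h sample reported
}
EPO_F = {
    "TK24": [0, 0, 0, 0],
    "pANT849-ebh": [0, 4, 21, 14],
    "pANT849-ebh10-53": [0, 9, 29, 41],
    "pANT849-ebh24-16": [0, 23, 52, None],
}


def mass_balance(strain):
    """Return (time_h, B consumed %, F detected %, B + F %) per sampled time."""
    rows = []
    for t, b, f in zip(TIMES_H, EPO_B[strain], EPO_F[strain]):
        if b is None or f is None:
            continue  # skip time points that were not sampled
        rows.append((t, 100 - b, f, b + f))
    return rows


if __name__ == "__main__":
    for strain in EPO_B:
        print(strain)
        for row in mass_balance(strain):
            print("  t=%2d h  B consumed=%3d%%  F=%3d%%  B+F=%3d%%" % row)
```

The last column is a bookkeeping figure only; where it falls below 100%, the tables do not indicate what became of the remainder.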





The ability of cells expressing ebh to biotransform compactin to pravastatin
was also examined. In these experiments, frozen spore preparations of S. lividans
(pANT849) or S. lividans (pANT849-ebh) were grown for several days at 30°C.
Following the pre-incubation, an aliquot of each cell culture was transferred to a
polypropylene culture tube, compactin was added to each culture tube, and the tubes
were incubated for 24 hours, 30°C, 250 rpm. An aliquot of the culture broth was then
extracted and compactin and pravastatin values relative to the control S. lividans
(pANT849) culture were measured via HPLC.

Compactin and pravastatin as a percentage of starting compactin
concentration:

              S. lividans (pANT849)   S. lividans (pANT849-ebh)
Compactin     36%                     11%
Pravastatin   11%                     53%



As discussed supra, mutant ebh25-1 (SEQ ID NO:30) exhibits altered
substrate specificity and biotransformation of epothilone B by this mutant resulted in
a product with a different HPLC elution time than epothilone B or epothilone F. A
sample of this unknown was analyzed by LC-MS and was found to have a molecular
weight of 523 (M.W.), consistent with a single hydroxylation of epothilone B. The
structure of the biotransformation product was determined as 24-hydroxyl-epothilone
B, based on MS and NMR data (compared with data of epothilone B):

24-hydroxyl-epothilone B
Formula A

Molecular Formula: C27H41NO7S
Molecular Weight: 523
Mass Spectrum: ES+ (m/z): 524 ([M+H]+), 506
LC/MS/MS: +ESI (m/z): 524, 506, 476, 436, 320
HRMS: Calculated for [M+H]+: 524.2682; Found: 524.2701
HPLC (Rt): 7.3 minutes (on the analytical HPLC system)

LC/NMR Observed Chemical Shifts

Varian AS-600 (Proton: 599.624 MHz),
Solvent: D2O/CD3CN (δ 1.94): ~4/6
Proton: δ 7.30 (s, 1H), 6.43 (s, 1H), 5.30 (m, 1H), 4.35 (m, 1H),
3.81 (m, 1H), 3.74 (m, 1H), 3.68 (m, 1H), 3.43 (m, 1H), 2.87
(m, 1H), 2.66 (s, 3H), 2.40 (m, 2H), 1.58 (b, 1H), 1.48 (b, 1H),
1.35 (m, 3H), 1.18 (s, 3H), 1.13 (s, 3H), 0.87 (m, 6H)
*Peaks between 1.8-2.1 ppm were not observed due to solvent
suppression.

The proton chemical shift was assigned as follows:
19
7.30
20
1.18
*SSP: not observed due to solvent suppression.
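
The HRMS figure reported for Formula A can be cross-checked from the molecular formula C27H41NO7S. The Python sketch below is an illustration, not part of the original analysis: it uses standard monoisotopic atomic masses, and it assumes that the "calculated" value was obtained by adding the mass of a hydrogen atom to the neutral molecule, since that reproduces 524.2682. Subtracting the electron mass as well gives a slightly lower value. The helper names are hypothetical.

```python
# Standard monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207100,
}
ELECTRON_MASS = 0.00054858

# Molecular formula reported for 24-hydroxyl-epothilone B (Formula A).
FORMULA_A = {"C": 27, "H": 41, "N": 1, "O": 7, "S": 1}


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def ppm_error(found, calculated):
    """Relative deviation of a measured m/z from the calculated m/z, in ppm."""
    return (found - calculated) / calculated * 1e6


neutral = monoisotopic_mass(FORMULA_A)
mh_plus_h_atom = neutral + MONOISOTOPIC["H"]      # assumption: matches the reported value
mh_plus_proton = mh_plus_h_atom - ELECTRON_MASS   # electron-corrected [M+H]+

if __name__ == "__main__":
    print("neutral M            %.4f" % neutral)
    print("[M+H]+ (H atom)      %.4f" % mh_plus_h_atom)
    print("[M+H]+ (proton)      %.4f" % mh_plus_proton)
    print("ppm error of 524.2701 vs 524.2682: %.1f" % ppm_error(524.2701, 524.2682))
```

Under these assumptions the found value of 524.2701 lies within a few ppm of the calculated value, which is consistent with the single added oxygen relative to epothilone B.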

Accordingly, the compositions and methods of the present invention are useful
in producing known compounds that are microtubule-stabilizing agents as well as new
compounds comprising epothilone analogs such as 24-hydroxyl-epothilone B
(Formula A) and pharmaceutically acceptable salts thereof expected to be useful as
microtubule-stabilizing agents. The microtubule stabilizing agents produced using
these compositions and methods are useful in the treatment of a variety of cancers and
other proliferative diseases including, but not limited to, the following:

- carcinoma, including that of the bladder, breast, colon, kidney, liver, lung,
ovary, pancreas, stomach, cervix, thyroid and skin; including squamous cell
carcinoma;
- hematopoietic tumors of lymphoid lineage, including leukemia, acute
lymphocytic leukemia, acute lymphoblastic leukemia, B-cell lymphoma, T-cell
lymphoma, Hodgkin's lymphoma, non-Hodgkin's lymphoma, hairy cell lymphoma and
Burkitt's lymphoma;
- hematopoietic tumors of myeloid lineage, including acute and chronic
myelogenous leukemias and promyelocytic leukemia;
- tumors of mesenchymal origin, including fibrosarcoma and
rhabdomyosarcoma;
- other tumors, including melanoma, seminoma, teratocarcinoma,
neuroblastoma and glioma;
- tumors of the central and peripheral nervous system, including astrocytoma,
neuroblastoma, glioma, and schwannomas;
- tumors of mesenchymal origin, including fibrosarcoma, rhabdomyosarcoma,
and osteosarcoma; and
- other tumors, including melanoma, xeroderma pigmentosum,
keratoacanthoma, seminoma, thyroid follicular cancer and teratocarcinoma.

Microtubule stabilizing agents produced using the compositions and methods
of the present invention will also inhibit angiogenesis, thereby affecting the growth of
tumors and providing treatment of tumors and tumor-related disorders. Such
anti-angiogenesis properties of these compounds will also be useful in the treatment of
other conditions responsive to anti-angiogenesis agents including, but not limited to,
certain forms of blindness related to retinal vascularization, arthritis, especially
inflammatory arthritis, multiple sclerosis, restenosis and psoriasis.

Microtubule stabilizing agents produced using the compositions and methods
of the present invention will induce or inhibit apoptosis, a physiological cell death
process critical for normal development and homeostasis. Alterations of apoptotic
pathways contribute to the pathogenesis of a variety of human diseases. Compounds
of the present invention such as those set forth in formula I and II and Formula A, as
modulators of apoptosis, will be useful in the treatment of a variety of human diseases
with aberrations in apoptosis including, but not limited to, cancer and precancerous
lesions, immune response related diseases, viral infections, degenerative diseases of
the musculoskeletal system and kidney disease.

Without wishing to be bound to any mechanism or morphology, microtubule
stabilizing agents produced using the compositions and methods of the present
invention may also be used to treat conditions other than cancer or other proliferative
diseases. Such conditions include, but are not limited to, viral infections such as
herpesvirus, poxvirus, Epstein-Barr virus, Sindbis virus and adenovirus; autoimmune
diseases such as systemic lupus erythematosus, immune mediated glomerulonephritis,
rheumatoid arthritis, psoriasis, inflammatory bowel diseases and autoimmune diabetes
mellitus; neurodegenerative disorders such as Alzheimer's disease, AIDS-related
dementia, Parkinson's disease, amyotrophic lateral sclerosis, retinitis pigmentosa,
spinal muscular atrophy and cerebellar degeneration; AIDS; myelodysplastic
syndromes; aplastic anemia; ischemic injury associated myocardial infarctions; stroke
and reperfusion injury; restenosis; arrhythmia; atherosclerosis; toxin-induced or
alcohol induced liver diseases; hematological diseases such as chronic anemia and
aplastic anemia; degenerative diseases of the musculoskeletal system such as
osteoporosis and arthritis; aspirin-sensitive rhinosinusitis; cystic fibrosis; multiple
sclerosis; kidney diseases; and cancer pain.

The following nonlimiting examples are provided to further illustrate the
present invention.

EXAMPLES

Example 1: Reagents
R2 Medium was prepared as follows:
A solution containing sucrose (103 grams), K2SO4 (0.25 grams), MgCl2·6H2O
(10.12 grams), glucose (10 grams), Difco Casaminoacids (0.1 grams) and distilled
water (800 ml) was prepared. Eighty ml of this solution was then poured into a 200
ml screw capped bottle containing 2.2 grams Difco Bacto agar. The bottle was
capped and autoclaved. At time of use, the medium was remelted and the following
autoclaved solutions were added in the order listed:

1 ml KH2PO4 (0.5%)
8 ml CaCl2·2H2O (3.68%)
1.5 ml L-proline (20%)
10 ml TES buffer (5.73%, adjusted to pH 7.2)
0.2 ml Trace element solution containing ZnCl2 (40 mg), FeCl3·6H2O (200 mg),
CuCl2·2H2O (10 mg), MnCl2·4H2O (10 mg), Na2B4O7·10H2O (10 mg), and
(NH4)6Mo7O24·H2O
0.5 ml NaOH (1N) (sterilization not required)
0.5 ml Required growth factors for auxotrophs (Histidine (50 µg/ml); Cysteine
(37 µg/ml); adenine, guanine, thymidine and uracil (7.5 µg/ml); and Vitamins (0.5
µg/ml))

R2YE medium was prepared in the same fashion as R2 medium. However, 5 ml of
Difco yeast extract (10%) was added to each 100 ml flask at time of use.
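
As a worked illustration of the R2 recipe, the final volume and the approximate final concentration of each percent-strength additive can be computed from the listed volumes. The Python sketch below assumes that volumes are simply additive and ignores the volume contributed by the agar; the variable and function names are hypothetical.

```python
BASE_VOLUME_ML = 80.0

# (volume added in ml, stock strength in % w/v or None where not given as a percentage)
ADDITIONS = {
    "KH2PO4": (1.0, 0.5),
    "CaCl2.2H2O": (8.0, 3.68),
    "L-proline": (1.5, 20.0),
    "TES buffer": (10.0, 5.73),
    "Trace element solution": (0.2, None),
    "NaOH (1N)": (0.5, None),
    "Growth factors": (0.5, None),
}


def final_volume_ml():
    """Total volume after all additions, assuming additive volumes."""
    return BASE_VOLUME_ML + sum(volume for volume, _ in ADDITIONS.values())


def final_percentages():
    """Final % (w/v) of each additive whose stock strength is given as a percentage."""
    total = final_volume_ml()
    return {
        name: volume * stock / total
        for name, (volume, stock) in ADDITIONS.items()
        if stock is not None
    }


if __name__ == "__main__":
    print("final volume: %.1f ml" % final_volume_ml())
    for name, pct in final_percentages().items():
        print("%-12s %.4f %%" % (name, pct))
```

For R2YE medium, the additional 5 ml of yeast extract would increase the total volume accordingly; that adjustment is left out of this sketch.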

P (protoplast) buffer was prepared as follows:

A basal solution made up of the following was prepared:
Sucrose (103 grams)
K2SO4 (0.25 grams)
MgCl2·6H2O (2.02 grams)
Trace Element Solution as described for R2 medium (2 ml)
Distilled water to 800 ml

Eighty ml aliquots of the basal solution were then dispensed and autoclaved. Before
use, the following was added to each flask in the order listed:

1 ml KH2PO4 (0.5%)
10 ml CaCl2·2H2O (3.68%)
TES buffer (5.75%, adjusted to pH 7.2)

T (transformation) buffer was prepared by mixing the following sterile solutions:

25 ml Sucrose (10.3%)
75 ml distilled water
1 ml Trace Element Solution as described for R2 medium
1 ml K2SO4 (2.5%)

The following are then added to 9.3 ml of this solution:
0.2 ml CaCl2 (5M)
0.5 ml Tris maleic acid buffer prepared from 1 M solution of Tris adjusted to
pH 8.0 by adding maleic acid.
For use, 3 parts by volume of the above solution are added to 1 part by weight of PEG
1000, previously sterilized by autoclaving.

L (lysis) buffer was prepared by mixing the following sterile solutions:
100 ml Sucrose (10.3%)
10 ml TES buffer (5.73%, adjusted to pH 7.2)
1 ml K2SO4 (2.5%)
1 ml Trace Element Solution as described for R2 medium
1 ml KH2PO4 (0.5%)
0.1 ml MgCl2·6H2O (2.5 M)
1 ml CaCl2 (0.25 M)

CRM Medium

A solution containing the following components was prepared in 1 liter of
dH2O: glucose (10 grams), sucrose (103 grams), MgCl2·6H2O (10.12 grams), BBL™
trypticase soy broth (15 grams) (Becton Dickinson Microbiology Systems, Sparks,
Maryland, USA), and BBL™ yeast extract (5 grams) (Becton Dickinson
Microbiology Systems). The solution was autoclaved for 30 minutes. Thiostrepton
was added to a concentration of 10 µg/ml for cultures propagated with plasmids.

Electroporation Buffer

A solution containing 30% (wt/vol) PEG 1000, 10% glycerol, and 6.5%
sucrose was prepared in dH2O. The solution was sterilized by vacuum filtration
through a 0.22 µm cellulose acetate filter.

Example 2: Extraction of Chromosomal DNA from Strain SC15847

Genomic DNA was isolated from an Amycolatopsis orientalis soil isolate,
strain designation SC15847 (ATCC PT-1043), using a guanidine-detergent lysis
method, DNAzol reagent (Invitrogen, Carlsbad, California, USA). The SC15847
culture was grown 24 hours at 28°C in F7 medium (glucose 2.2%, yeast extract 1.0%,
malt extract 1.0%, peptone 0.1%, pH 7.0). Twenty ml of culture was harvested by
centrifugation and resuspended in 20 ml of DNAzol, mixed by pipetting and
centrifuged 10 minutes in the Beckman TJ6 centrifuge. Ten ml of 100% ethanol was
added, inverted several times and stored at room temperature 3 minutes. The DNA
was spooled on a glass pipette, washed in 100% ethanol and allowed to air dry 10
minutes. The pellet was resuspended in 500 µl of 8 mM NaOH and once dissolved it
was neutralized with 30 µl of 1M HEPES pH 7.2.

Example 3: PCR Reactions

PCR reactions were prepared in a volume of 50 µl, containing 200-500 ng of
genomic DNA or 1.0 µl of the cDNA, a forward and a reverse primer, the forward
primer being either P450-1+ (SEQ ID NO:23) or P450-1a+ (SEQ ID NO:24) or
P450-2+ (SEQ ID NO:25) and the reverse primer P450-3- (SEQ ID NO:27) or P450-2- (SEQ
ID NO:26). All primers were added to a final concentration of 1.4-2.0 µM. The PCR
reaction was prepared with 1 µl of Taq enzyme (2.5 units) (Stratagene), 5 µl of Taq
buffer and 4 µl of 2.5 mM of dNTPs with dH2O to 50 µl. The cycling reactions were
performed on a Geneamp® PCR system with the following protocol: 95°C for 5
minutes, 5 cycles [95°C 30 seconds, 37°C 15 seconds (30% ramp), 72°C 30 seconds],
35 cycles (94°C 30 seconds, 65°C 15 seconds, 72°C 30 seconds), 72°C 7 minutes.
The expected sizes for the reactions are 340 bp for the P450-1+ (SEQ ID NO:23) or
P450-1a+ (SEQ ID NO:24) and P450-3- (SEQ ID NO:27) primer pairs, 240 bp for the
P450-1+ (SEQ ID NO:23) and P450-2- (SEQ ID NO:26) primer pairs and 130 bp for
the P450-2+ (SEQ ID NO:25) and P450-3- (SEQ ID NO:27) primer pairs.
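
The reaction set-up and cycling protocol of Example 3 can be written down as data, which makes two derived quantities easy to check: the final dNTP concentration in the 50 µl reaction and the total programmed hold time. The Python sketch below is illustrative; the names are hypothetical, the stated "2.5 mM" is taken at face value as the stock concentration, and ramp times between temperatures (including the 30% ramp) are not modelled.

```python
REACTION_VOLUME_UL = 50.0


def final_concentration(stock_conc, stock_volume_ul, total_volume_ul=REACTION_VOLUME_UL):
    """Simple dilution: C_final = C_stock * V_stock / V_total (same units as stock)."""
    return stock_conc * stock_volume_ul / total_volume_ul


# Cycling protocol as (repeats, [(temperature in deg C, hold time in seconds), ...]).
PROGRAM = [
    (1, [(95, 300)]),
    (5, [(95, 30), (37, 15), (72, 30)]),
    (35, [(94, 30), (65, 15), (72, 30)]),
    (1, [(72, 420)]),
]


def total_hold_seconds(program):
    """Sum of all programmed hold times, excluding ramping between steps."""
    return sum(repeats * sum(seconds for _, seconds in steps) for repeats, steps in program)


if __name__ == "__main__":
    print("final dNTP: %.2f mM" % final_concentration(2.5, 4.0))
    print("programmed hold time: %.0f min" % (total_hold_seconds(PROGRAM) / 60))
```

The actual run time on the instrument would be longer than the hold-time total because of heating and cooling between steps.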

Example 4: Cloning of Epothilone B Hydroxylase and Ferredoxin Genes

Twenty µg of SC15847 genomic DNA was digested with BglII restriction
enzyme for 6 hours at 37°C. A 30K Nanosep column (Gelman Sciences, Ann Arbor,
Michigan, USA) was used to concentrate the DNA and remove the enzyme and
buffer. The reactions were concentrated to 40 µl and washed with 200 µl of TE. The
digestion products were then separated on a 0.7% agarose gel and genomic DNA in the
range of 12-15 kb was excised from the gel and purified using the Qiagen gel
extraction method. The genomic DNA was then ligated to plasmid pWB19N (U.S.
Patent 5,516,679), which had been digested with BamHI and dephosphorylated using
the SAP I enzyme (Roche Molecular Biochemicals, Indianapolis, Indiana, catalog #1
758 250). Ligation reactions were performed in a 15 µl volume with 1U of T4 DNA
ligase (Invitrogen) for 1 hour at room temperature. One µl of the ligation was
transformed to 100 µl of chemically competent DH10B cells (Invitrogen) and 100 µl
plated to five LB agar plates with 30 µg/ml of neomycin, 37°C overnight.

Five nylon membrane circles (Roche Molecular Biochemicals, Indianapolis,
Indiana) were numbered and marked for orientation. The membranes were placed on
the plates 2 minutes and then allowed to dry for 5 minutes. The membranes were then
placed on Whatman filter disks saturated with 10% SDS for 5 minutes, 0.5N NaOH
with 1.5 M NaCl for 5 minutes, 1.5 M NaCl with 1.0 M Tris pH 8.0 for 5 minutes,
and 15 minutes on 2X SSC. The filters were hybridized as described previously for
the Southern hybridization. Hybridizing colonies were picked to 2 ml of TB with 30
µg/ml neomycin and grown overnight at 37°C. Plasmid DNA was isolated using a
miniprep column procedure (Mo Bio). This plasmid was named NPB29-1.

Example 5: DNA Sequencing and Analysis

The cloned PCR products were sequenced using fluorescent-dye-labeled
terminator cycle sequencing, Big-Dye sequencing kit (Applied Biosystems, Foster
City, California, USA) and were analyzed using laser-induced fluorescence capillary
electrophoresis, ABI Prism 310 sequencer (Applied Biosystems).

Example 6: Extraction of Total RNA

Total RNA was isolated from the SC15847 culture using a modification of the
Chomczynski and Sacchi method with a mono-phasic solution of phenol and
guanidine isothiocyanate, Trizol reagent (Invitrogen). Five ml of an SC15847 frozen
stock culture was thawed and used to inoculate 100 ml of F7 media in a 500 ml
Erlenmeyer flask. The culture was grown in a shaker incubator at 230 rpm, 30°C for
20 hours to an optical density at 600 nm (OD600) of 9.0. The culture was placed in a
16°C shaker incubator at 230 rpm for 20 minutes. Fifty-five milligrams of epothilone
B was dissolved in 1 ml of 100% ethanol and added to the culture. A second ml of
ethanol was used to rinse the residual epothilone B from the tube and added to the
culture. The culture was incubated at 16°C, 230 rpm for 30 hours. Thirty ml of the
culture was transferred to a 50 ml tube, 150 mg of lysozyme was added to the culture
and the culture was incubated 5 minutes at room temperature. Ten ml of the culture
was placed in a 50 ml Falcon tube and centrifuged 5 minutes, 4°C in a TJ6 centrifuge.
Two ml of chloroform was added and the tube was mixed vigorously for 15 seconds.
The tube was incubated 2 minutes at room temperature and centrifuged 10 minutes,
top speed in the TJ6 centrifuge. The aqueous layer was transferred to a fresh tube and
2.5 ml of isopropanol was added to precipitate the RNA. The tube was incubated 10
minutes at room temperature and centrifuged 10 minutes, 4°C. The supernatant was
removed, the pellet was rinsed with 70% ethanol and dried briefly under vacuum. The
pellet was resuspended in 150 µl of RNase-free dH2O. Fifty µl of 7.5M LiCl was
added to the RNA and incubated at -20°C for 30 minutes. The RNA was pelleted by
centrifugation 10 minutes, 4°C in a microcentrifuge. The pellet was rinsed with 200 µl
of 70% ethanol, dried briefly under vacuum and resuspended in 150 µl of RNase-free
dH2O.

The RNA was treated with DNase I (Ambion, Austin, Texas, USA). Twenty-five
µl of total RNA (5.3 µg/µl), 2.5 µl of DNase I buffer, and 1.0 µl of DNase I were
combined and incubated at 37°C for 25 minutes. Five µl of DNase I inactivation buffer
was added, the mixture was incubated 2 minutes and centrifuged 1 minute, and the
supernatant was transferred to a fresh tube.

Example 7: cDNA Synthesis

cDNA was synthesized from the total RNA using the Superscript II enzyme
(Invitrogen). The reaction was prepared with 1 µl of total RNA (5.3 µg/µl), 9 µl of
dH2O, 1 µl of dNTP mix (10 mM), and 1 µl of random hexamers. The reaction was
incubated at 65°C for 5 minutes then placed on ice. The following components were
then added: 4 µl of 1st strand buffer, 1 µl of RNase Inhibitor, 2.0 µl of 0.1 M DTT,
and 1 µl of Superscript II enzyme. The reaction was incubated at room temperature 10
minutes, 42°C for 50 minutes and 70°C for 15 minutes. One µl of RNaseH was added
and incubated 20 minutes at 37°C, 15 minutes at 70°C and stored at 4°C.

Example 8: DNA Labeling

The PCR conditions used to amplify the P450 specific products from genomic
DNA and cDNA were used to amplify the insert of plasmid pCRscript-29. Plasmid
pCRscript-29 contains a 340 bp PCR fragment amplified from SC15847 genomic
DNA using primers P450-1+ (SEQ ID NO:23) and P450-3- (SEQ ID NO:27). Two µl
of the plasmid prep was used as a template, with a total of 25 cycles. The amplified
product was gel purified using the Qiaquick gel extraction system (Qiagen). The
extracted DNA was ethanol precipitated and resuspended in 5 µl of TE; the yield was
estimated to be 500 ng. This fragment was labeled with digoxigenin using the chem
link labeling reagent (Roche Molecular Biochemicals, Indianapolis, Indiana catalog
#1 836 463). Five µl of the PCR product was mixed with 0.5 µl of Dig-chem link and
dH2O added to 20 µl. The reaction was incubated 30 minutes at 85°C and 5 µl of stop
solution added. The probe concentration was estimated at 20 ng/µl.
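
The estimated probe concentration follows directly from the quantities given in Example 8: roughly 500 ng of fragment carried into a 20 µl labeling reaction, brought to 25 µl by the stop solution. The short Python sketch below reproduces that arithmetic; the function name is hypothetical, and the calculation assumes that all of the estimated 500 ng was recovered in the labeled probe.

```python
def probe_concentration_ng_per_ul(dna_ng, labeling_volume_ul, stop_volume_ul):
    """Mass of DNA divided by the final volume after stop solution is added."""
    return dna_ng / (labeling_volume_ul + stop_volume_ul)


if __name__ == "__main__":
    conc = probe_concentration_ng_per_ul(dna_ng=500, labeling_volume_ul=20, stop_volume_ul=5)
    print("probe concentration: %.0f ng/ul" % conc)
```

With the stated inputs the result agrees with the concentration estimated in the example.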
Example 9: Southern DNA Hybridization

Ten µl of genomic DNA (0.5 µg/µl) was digested with BamHI, BglII, EcoRI,
HindIII or NotI and separated at 12 volts for 16 hours. The gel was depurinated 10
minutes in 0.25 N HCl and transferred by vacuum to a nylon membrane (Roche
Molecular Biochemicals) in 0.4 N NaOH, 5 in. Hg, 90 minutes using a vacuum blotter
(Bio-Rad Laboratories, Inc. Hercules, California, USA catalog # 165-5000). The
membrane was rinsed in 1 M ammonium acetate and UV-crosslinked using the
Stratalinker UV Crosslinker (Stratagene). The membrane was rinsed in 2X SSC and
stored at room temperature.

The membrane was prehybridized 1 hour at 42°C in 20 ml of Dig Easy Hyb
buffer (Roche Molecular Biochemicals). The probe was denatured 10 minutes at 65°C
and then placed on ice. Five ml of probe in Dig-Easy Hyb at an approximate
concentration of 20 ng/ml was incubated with the membrane at 42°C overnight. The
membrane was washed 2 times in 2X SSC with 0.1% SDS at room temperature, then
2 times in 0.5X SSC with 0.1% SDS at 65°C. The membrane was equilibrated in
Genius buffer 1 (10 mM maleic acid, 15 mM NaCl; pH 7.5; 0.3% v/v Tween 20)
(Roche Molecular Biochemicals, Indianapolis, Indiana) for 2 minutes, then incubated
with 2% blocking solution (2% Blocking reagent in Genius Buffer 1) (Roche
Molecular Biochemicals, Indianapolis, Indiana) for 1 hour at room temperature. The
membrane was incubated with a 1:20,000 dilution of anti-dig antibody in 50 ml of
blocking solution for 30 minutes. The membrane was washed 2 times, 15 minutes
each in 50 ml of Genius buffer 1. The membrane was equilibrated for two minutes in
Genius Buffer 3 (10 mM Tris-HCl, 10 mM NaCl; pH 9.5). One ml of a 1:100 dilution
of CSPD (disodium 3-(4-methoxyspiro{1,2-dioxetane-3,2'-(5'-
chloro)tricyclo[3.3.1.1(3,7)]decan}-4-yl)phenyl phosphate) (Roche Molecular
Biochemicals) in Genius buffer 3 was added to the membrane and incubated 5
minutes at room temperature, then placed at 37°C for 15 minutes. The membrane was
exposed to Biomax ML film (Kodak, Rochester, New York, USA) for 1 hour.

Example 10: E. coli Transformation 




Competent cells were purchased from Invitrogen. E. coli strain DH10B was
used as a host for genomic cloning. The chemically competent cells were thawed on
ice and 100 µl aliquoted to a 17 x 100-mm polypropylene tube on ice. One µl of the
ligation mixture was added to the cells and incubated on ice for 30 minutes. The cells
were incubated at 42°C for 45 seconds, then placed on ice 1-2 minutes. 0.9 ml of
SOC medium (Invitrogen) was added and the cells were incubated one hour at 30-
37°C at 200-240 rpm. Cells were plated on a selective medium (Luria agar with
neomycin or ampicillin at a concentration of 30 µg/ml or 100 µg/ml respectively).

Example 11: Transformation of Streptomyces lividans TK24

Plasmid pWB19N849 was constructed by digesting plasmid pWB19N with
HindIII and treating with SAP I, and digesting plasmid pANT849 (Kieser, et al., 2000,
Practical Streptomyces Genetics, John Innes) with HindIII. The two linearized
fragments were ligated for 1 hour at room temperature with 1 U of T4 DNA ligase. One µl
of the ligation reaction was used to transform XL-1 Blue electrocompetent cells
(Stratagene). The recovered cells were plated to LB neomycin (30 µg/ml) overnight at
37°C. Colonies were picked to 2 ml of LB with 30 µg/ml neomycin and incubated
overnight at 30°C. MoBio plasmid minipreps were performed on all cultures.
Plasmids constructed from the ligation of pWB19N and pANT849 were identified
by electrophoretic mobility on 0.7% agarose. The plasmid pWB19N849 was digested
with HindIII and BglII to excise a 5.3 kb fragment equivalent to plasmid pANT849
digested with BglII and HindIII. This 5.3 kb fragment was purified on an agarose gel
and extracted using the Qiaquick gel extraction system.

A 1.469 kb DNA fragment containing the epothilone B hydroxylase gene and
the downstream ferredoxin gene was amplified using PCR. The 50 µl PCR reaction
was composed of 5 µl of Taq buffer, 2.5 µl glycerol, 1 µl of 20 ng/µl NPB29-1
plasmid, 0.4 µl of 25 mM dNTPs, 1.0 µl each of primers NPB29-6F (SEQ ID NO:28)
and NPB29-7R (SEQ ID NO:29) (5 pmole/µl), 38.1 µl of dH2O and 0.5 µl of Taq
enzyme (Stratagene). The reactions were performed on a Perkin Elmer 9700: 95°C for
5 minutes, then 30 cycles (96°C for 30 seconds, 60°C for 30 seconds, 72°C for 2
minutes), and 72°C for 7 minutes. The PCR product was purified using a Qiagen
minielute column with the PCR cleanup procedure. The purified product was digested
with BglII and HindIII and purified on a 0.7% agarose gel. A 1.469 kb band was
excised from the gel and eluted using a Qiagen minielute column. Five µl of this PCR
product was ligated with 2 µl of the BglII, HindIII digested pANT849 vector in a 10
µl ligation reaction. The reaction was incubated at room temperature for 24 hours and
then transformed to S. lividans TK24 protoplasts.
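
As an illustrative cross-check of the reaction composition listed above, the short Python sketch below sums the stated component volumes and derives two simple quantities. The dictionary keys and function names are hypothetical, and treating "25 mM dNTPs" as a single stock concentration is an assumption, since the text does not say whether the figure is per nucleotide or total.

```python
# Component volumes (microlitres) exactly as listed for the nominal 50 ul PCR reaction.
components_ul = {
    "Taq buffer": 5.0,
    "glycerol": 2.5,
    "NPB29-1 plasmid (20 ng/ul)": 1.0,
    "dNTPs (25 mM stock)": 0.4,
    "primer NPB29-6F (5 pmole/ul)": 1.0,
    "primer NPB29-7R (5 pmole/ul)": 1.0,
    "dH2O": 38.1,
    "Taq enzyme": 0.5,
}

NOMINAL_VOLUME_UL = 50.0


def total_volume_ul(components):
    """Sum of all listed component volumes, rounded to avoid float noise."""
    return round(sum(components.values()), 3)


def final_concentration(stock_conc, stock_volume_ul, reaction_volume_ul):
    """Dilution arithmetic: C_final = C_stock * V_stock / V_total."""
    return stock_conc * stock_volume_ul / reaction_volume_ul


listed_total = total_volume_ul(components_ul)
template_ng = 20.0 * components_ul["NPB29-1 plasmid (20 ng/ul)"]
dntp_final_mM = final_concentration(
    25.0, components_ul["dNTPs (25 mM stock)"], NOMINAL_VOLUME_UL
)
primer_pmole = 5.0 * components_ul["primer NPB29-6F (5 pmole/ul)"]

print(f"listed volumes sum to {listed_total} ul (nominal {NOMINAL_VOLUME_UL} ul)")
print(f"template per reaction: {template_ng} ng")
print(f"dNTP final concentration: {dntp_final_mM} mM")
print(f"each primer per reaction: {primer_pmole} pmole")
```

The listed volumes sum to slightly less than the nominal 50 µl; the sketch only reports that difference and does not attempt to explain it.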

Twenty ml of YEME media was inoculated with a frozen spore suspension of
S. lividans TK24 and grown 48 hours in a 125 ml bi-indent flask. Protoplasts were
prepared as described in Practical Streptomyces Genetics. The ligation reaction was
mixed with protoplasts, then 500 µl of transformation buffer was added, followed
immediately by 5 ml of P buffer. The transformation reactions were spun down for 7
minutes at 2,750 rpm, resuspended in 100 µl of P buffer and plated to one R2YE
plate. The plate was incubated at 28°C for 20 hours, then overlaid with 5 ml of LB
0.7% agar with 250 µg/ml thiostrepton. After 7 days colonies were picked to an
R2YE grid plate with 50 µg/ml of thiostrepton. The colonies were grown an
additional 5 days at 28°C, then stored at 4°C.

This recombinant microorganism has been deposited with the ATCC and 
designated PTA-4022. 

Example 12: Transformation of Streptomyces rimosus

The procedure of Pigac and Schrempf, Appl. Environ. Microbiol., Vol. 61, No. 1,
352-356 (1995) was used to transform S. rimosus. S. rimosus strain R6 593 was
cultivated in 20 ml of CRM medium at 30°C on a rotary shaker (250 rpm). The cells
were harvested at 24 hrs by centrifugation for 5 minutes, 5,000 rpm, 4°C, and
resuspended in 20 ml of 10% sucrose, 4°C, and centrifuged for 5 minutes, 5,000 rpm,
4°C. The pellet was resuspended in 10 ml of 15% glycerol, 4°C, and centrifuged for 5
minutes, 5,000 rpm, 4°C. The pellet was resuspended in 2 ml of 15% glycerol, 4°C,
with 100 µg/ml lysozyme and incubated at 37°C for 30 minutes, centrifuged for 5
minutes, 5,000 rpm, 4°C, and resuspended in 2 ml of 15% glycerol, 4°C. The 15%
glycerol wash was repeated once and the pellet was resuspended in 1 to 2 ml of
Electroporation Buffer. The cells were stored at -80°C in 50-200 µl aliquots.

The ligations were prepared as described for the S. lividans transformation.
After the incubation of the ligation reaction, the volume was brought to 100 µl with
dH2O, NaCl was added to 0.3 M, and the reaction was extracted with an equal volume of
24:1:1 phenol:chloroform:isoamyl alcohol. Twenty µg of glycogen was added and the
ligated DNA was precipitated with 2 volumes of 100% ethanol at -20°C for 30
minutes. The DNA was pelleted for 10 minutes in a microcentrifuge, washed once with
70% ethanol, dried 5 minutes in a speed-vac concentrator and resuspended in 5 µl of
dH2O.

One frozen aliquot of cells was thawed at room temperature and divided, 50
µl/tube for each DNA sample for electroporation. The cells were stored on ice until
use. DNA in 1 to 2 µl of dH2O was added and mixed. The cell and DNA mixture was
transferred to a 2 mm gapped electrocuvette (Bio-Rad Laboratories, Richmond,
California, USA) that was pre-chilled on ice. The cells were electroporated at a setting
of 2 kV (10 kV/cm), 25 µF, 400 Ω using a Gene Pulser™ (Bio-Rad Laboratories). The
cells were diluted with 0.75 to 1.0 ml of CRM (0-4°C), transferred to 15 ml culture
tubes and incubated with agitation for 3 hrs at 30°C. The cells were plated on trypticase
soy broth agar plates with 10-30 µg/ml of thiostrepton.
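
The electroporation settings above are related by simple arithmetic, sketched here in Python. The function names are hypothetical. The time constant is the nominal R x C product for the stated capacitor and resistor only; the pulse actually delivered also depends on the sample resistance, which the text does not give.

```python
def field_strength_kv_per_cm(voltage_kv, gap_mm):
    """Nominal field strength = applied voltage / electrode gap (gap converted to cm)."""
    gap_cm = gap_mm / 10.0
    return voltage_kv / gap_cm


def nominal_time_constant_ms(capacitance_uf, resistance_ohm):
    """Nominal RC time constant. microfarad * ohm = microseconds, so divide by 1000 for ms."""
    return capacitance_uf * resistance_ohm / 1000.0


# Settings stated in the text: 2 kV across a 2 mm cuvette, 25 uF, 400 ohm.
field = field_strength_kv_per_cm(voltage_kv=2.0, gap_mm=2.0)
tau_ms = nominal_time_constant_ms(capacitance_uf=25.0, resistance_ohm=400.0)

print(f"field strength: {field:.1f} kV/cm")
print(f"nominal time constant: {tau_ms:.1f} ms")
```

The computed field strength agrees with the 10 kV/cm figure given in parentheses in the text.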

Example 13: High performance liquid chromatography

The liquid chromatography separation was performed using a Waters 2690
Separation Module system (Waters Corp., Milford, MA, USA) and a column, 4.6 x
150 mm, filled with SymmetryShield RP8, particle size 3.5 µm (Waters Corp.,
Milford, MA, USA). Gradient mobile phase programming was used with a flow
rate of 1.0 ml/minute. Eluent A was water/acetonitrile (20:1) + 10 mM ammonium
acetate. Eluent B was acetonitrile/water (20:1). The mobile phase was a linear
gradient from 12% B to 28% B over 6 minutes, then held isocratic at 28% B for 4
minutes. This was followed by a 28% B to 100% B linear gradient over 20 minutes
and a linear gradient to 12% B over two minutes with a 3 minute hold at 12% B.
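
The gradient program described above can be written as a table of breakpoints with linear interpolation between them. The following Python sketch is only an illustration of that program; the names GRADIENT and percent_b are hypothetical, and the breakpoint times are the cumulative sums of the segment durations stated in the text (6, 4, 20, 2 and 3 minutes).

```python
from bisect import bisect_right

# (time in minutes, % eluent B) breakpoints derived from the stated segments.
GRADIENT = [
    (0.0, 12.0),    # start
    (6.0, 28.0),    # 12% -> 28% B over 6 minutes
    (10.0, 28.0),   # isocratic hold at 28% B for 4 minutes
    (30.0, 100.0),  # 28% -> 100% B over 20 minutes
    (32.0, 12.0),   # return to 12% B over 2 minutes
    (35.0, 12.0),   # 3 minute hold at 12% B
]


def percent_b(t_min):
    """Percent eluent B at time t_min, by linear interpolation between breakpoints."""
    if t_min <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    if t_min >= GRADIENT[-1][0]:
        return GRADIENT[-1][1]
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min)          # index of the first breakpoint after t_min
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


total_run_time_min = GRADIENT[-1][0]

for t in (0, 3, 8, 20, 31, 35):
    print(f"t = {t:>2} min: {percent_b(t):5.1f}% B")
```

Under these breakpoints one full cycle of the program takes 35 minutes.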

Example 14: Mass spectrometry 

The column effluent was introduced directly into the electrospray ion source of a
ZMD mass spectrometer (Micromass, Manchester, UK). The instrument was calibrated
using Test Juice reference standard (Waters Corp., Milford, MA, USA), which was delivered
at a flow of 10 µl/minute from a syringe pump (Harvard Apparatus, Holliston, MA,
USA). The mass spectrometer was operated at a low mass resolution of 13.2 and a high
mass resolution of 11.2. Spectra were acquired using a scan range of m/z 100 to
600 at an acquisition rate of 10 spectra/second. The ionization technique employed was
positive electrospray (ES). The sprayer voltage was kept at 2900 V and the cone voltage
of the ion source was kept at a potential of 17 V.

Example 15: Use of the ebh gene sequence (SEQ ID NO:1) to isolate cytochrome
P450 genes from other microorganisms

Genomic DNA was isolated from a set of cultures (ATCC43491, 
ATCC14930, ATCC53630, ATCC53550, ATCC39444, ATCC4333 ^TCC35165) 
using the DNAzol reagent. The DNA was used as a template for PCR reactions using
primers designed to the sequence of the ebh gene. Three sets of primers were used for
amplification: NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29),
NPB29-16f (SEQ ID NO:50) and NPB29-17r (SEQ ID NO:51), and NPB29-19f (SEQ ID
NO:52) and NPB29-20r (SEQ ID NO:53).

PCR reactions were prepared in a volume of 20 µl, containing 200-500 ng of
genomic DNA and a forward and reverse primer. All primers were added to a final
concentration of 1.4-2.0 µM. The PCR reaction was prepared with 0.2 µl of
Advantage™ 2 Taq enzyme (BD Biosciences Clontech, Palo Alto, California, USA),
2 µl of Advantage™ 2 Taq buffer and 0.2 µl of 2.5 mM dNTPs, with dH2O to 20
µl. The cycling reactions were performed on a Geneamp® 9700 PCR system or a
Mastercycler® gradient (Eppendorf, Westbury, New York, USA) with the following
protocol: 95°C for 5 minutes, 35 cycles (96°C for 20 seconds, 54-69°C for 30 seconds, 72°C
for 2 minutes), 72°C for 7 minutes. The expected size of the PCR products is
approximately 1469 bp for the NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID
NO:29) primer pair, 1034 bp for the NPB29-16f (SEQ ID NO:50) and NPB29-17r
(SEQ ID NO:51) primer pair and 1318 bp for the NPB29-19f (SEQ ID NO:52) and
NPB29-20r (SEQ ID NO:53) primer pair. The PCR reactions were analyzed on 0.7%
agarose gels. PCR products of the expected size were excised from the gel and
purified using the Qiagen gel extraction method. The purified products were
sequenced using the Big-Dye sequencing kit and analyzed using an ABI 310
sequencer.

Example 16: Construction of plasmid pPCRscript-ebh

A 1.469 kb DNA fragment containing the epothilone B hydroxylase gene and
the downstream ferredoxin gene was amplified using PCR. The 50 µl PCR reaction
was composed of 5 µl of Taq buffer, 2.5 µl glycerol, 1 µl of 20 ng/µl NPB29-1
plasmid, 0.4 µl of 25 mM dNTPs, 1.0 µl each of primers NPB29-6f (SEQ ID NO:28)
and NPB29-7r (SEQ ID NO:29) (5 pmole/µl), 38.1 µl of dH2O and 0.5 µl of Taq
enzyme (Stratagene). The reactions were performed on a Geneamp® 9700 PCR
system, with the following conditions: 95°C for 5 minutes, then 30 cycles (96°C for
30 seconds, 60°C for 30 seconds, 72°C for 2 minutes), and 72°C for 7 minutes. The PCR
product was purified using a Qiagen Qiaquick column with the PCR cleanup
procedure. The purified product was digested with BglII and HindIII and purified on a
0.7% agarose gel. A 1.469 kb band was excised from the gel and eluted using a
Qiagen Qiaquick gel extraction procedure. The fragments were then cloned into the
pPCRscript Amp vector using the PCRscript Amp cloning kit. Colonies containing
inserts were picked to 1-2 ml of LB (Luria Broth) with 100 µg/ml ampicillin and grown at
30-37°C for 16-24 hours at 230-300 rpm. Plasmid isolation was performed using the Mo Bio
miniplasmid prep kit. The sequence of the insert was confirmed by cycle sequencing
with the Big-Dye sequencing kit. This plasmid was named pPCRscript-ebh.

Example 17: Mutagenesis of the ebh gene for improved yield or altered
specificity

The Quikchange® XL Site-Directed Mutagenesis Kit and the Quikchange®
Multi Site-Directed Mutagenesis kit, both from Stratagene, were used to introduce
mutations in the coding region of the ebh gene. Both of these methods employ DNA
primers 35-45 bases in length containing the desired mutation (SEQ ID NO: 54-59 and
71), a methylated circular plasmid template and PfuTurbo® DNA Polymerase (U.S.
Patent Nos 5,545,552 and 5,866,395 and 5,948,663) to generate copies of the plasmid
template incorporating the mutation carried on the mutagenic primers. Subsequent
digestion of the reaction with the restriction endonuclease DpnI selectively
digests the methylated plasmid template, but leaves the non-methylated mutated
plasmid intact. The manufacturer's instructions were followed for all procedures with
the exception of the DpnI digestion step, in which the incubation time was increased
from 1 hr to 3 hrs. The pPCRscript-ebh vector was used as the template for
mutagenesis.

One to two µl of the reaction was transformed to either XL1-Blue®
electrocompetent or XL10-Gold® ultracompetent cells (Stratagene). Cells were plated
to a density of greater than 100 colonies per plate on LA (Luria Agar) 100 µg/ml
ampicillin plates, and incubated 24-48 hrs at 30-37°C. The entire plate was
resuspended in 5 ml of LB containing 100 µg/ml ampicillin. Plasmid was isolated
directly from the resuspended cells by centrifuging the cells and then purifying the
plasmid using the Mo Bio miniprep procedure. This plasmid was then used as a
template for PCR with primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID
NO:29) to amplify a mutated expression cassette. Digestion of the 1.469 kb PCR
product with the restriction enzymes BglII and HindIII was used to prepare this
fragment for ligation to vector pANT849 also digested with BglII and HindIII.

Alternatively, the resuspended cells were used to inoculate 20-50 ml of LB
containing 100 µg/ml ampicillin and grown 18-24 hrs at 30-37°C. Qiagen midi-preps
were performed on the cultures to isolate plasmid DNA containing the desired
mutation. Digestion with the restriction enzymes BglII and HindIII was used to excise
the mutated expression cassette for ligation to BglII and HindIII digested plasmid
pANT849. Screening of mutants was performed in S. lividans or S. rimosus as
described.

Alternatively, the method of Leung et al., Technique - A Journal of Methods in
Cell and Molecular Biology, Vol. 1, No. 1, 11-15 (1989) was used to generate random
mutation libraries of the ebh gene. Manganese and/or reduced dATP concentration is
used to control the mutagenesis frequency of the Taq polymerase. The plasmid
pPCRscript-ebh was digested with NotI to linearize the plasmid. The polymerase buffer
was prepared with 0.166 M (NH4)2SO4, 0.67 M Tris-HCl pH 8.8, 61 mM MgCl2, 67
µM EDTA pH 8.0 and 1.7 mg/ml Bovine Serum Albumin. The PCR reaction was
prepared with 10 µl of NotI digested pPCRscript-ebh (0.1 ng/µl), 10 µl of polymerase
buffer, 1.0 µl of 1 M β-mercaptoethanol, 10.0 µl of DMSO, 1.0 µl of NPB29-6f (SEQ
ID NO:28) primer (100 pmole/µl), 1.0 µl of NPB29-7r (SEQ ID NO:29) primer (100
pmole/µl), 10 µl of 5 mM MnCl2, 10.0 µl 10 mM dGTP, 10.0 µl 2 mM dATP, 10 mM
dTTP, 10.0 µl 10 mM dCTP, and 2.0 µl Taq polymerase. dH2O was added to 100 µl.
Reactions were also prepared as described above but without MnCl2. The cycling
reactions were performed on a GeneAmp® PCR system with the following protocol:
95°C for 1 minute, 25-30 cycles (94°C for 1 minute, 55°C for 30 seconds, 72°C for 4
minutes), 72°C for 7 minutes. The PCR reactions were separated on an agarose gel
and purified using a Qiagen spin column. The fragments were then digested with BglII and HindIII
and purified using a Qiagen spin column. The purified fragments were then ligated to
BglII and HindIII digested pANT849 plasmids. Screening of mutants was performed
in S. lividans and S. rimosus.
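
For orientation, the final concentrations implied by the stock volumes listed above follow from C_final = C_stock x V_stock / V_total with a 100 µl reaction. The Python sketch below works through that arithmetic for the components whose volume and stock concentration are both stated; dTTP is left out because the text gives no volume for it. The names are hypothetical and the output is a reading aid, not a statement of the authors' intended concentrations.

```python
REACTION_VOLUME_UL = 100.0

# name: (stock concentration, unit, volume added in microlitres), as stated in the text
stocks = {
    "MnCl2": (5.0, "mM", 10.0),
    "dGTP": (10.0, "mM", 10.0),
    "dATP": (2.0, "mM", 10.0),
    "dCTP": (10.0, "mM", 10.0),
    "beta-mercaptoethanol": (1000.0, "mM", 1.0),   # 1 M stock
    "primer NPB29-6f": (100.0, "pmole/ul", 1.0),
    "primer NPB29-7r": (100.0, "pmole/ul", 1.0),
    "template DNA": (0.1, "ng/ul", 10.0),
}


def final_concentration(stock_conc, volume_ul, total_ul=REACTION_VOLUME_UL):
    """Dilution arithmetic: C_final = C_stock * V_stock / V_total."""
    return stock_conc * volume_ul / total_ul


final = {
    name: (final_concentration(conc, vol), unit)
    for name, (conc, unit, vol) in stocks.items()
}

# 1 pmole/ul equals 1 uM, so primer values can be read directly as micromolar.
for name, (value, unit) in final.items():
    print(f"{name:>22}: {value:g} {unit}")

# Ratio of the reduced dATP to the other stated dNTPs in the final mix.
datp_to_dgtp_ratio = final["dATP"][0] / final["dGTP"][0]
print(f"dATP : dGTP ratio = {datp_to_dgtp_ratio:g}")
```

The five-fold lower dATP concentration relative to dGTP and dCTP is the "reduced dATP" condition mentioned in the text.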

Table of Characterized Mutants

Mutant          Position   Substitution   Wild-type
ebh24-16        92         Valine         Isoleucine
                237        Alanine        Phenylalanine
ebh25-1         195        Serine         Asparagine
                294        Proline        Serine
ebh10-53        190        Tyrosine       Phenylalanine
                231        Arginine       Glutamic acid
ebh24-16d8      92         Valine         Isoleucine
                237        Alanine        Phenylalanine
                67         Glutamine      Arginine
ebh24-16c11     92         Valine         Isoleucine
                93         Glycine        Alanine
                237        Alanine        Phenylalanine
                365        Threonine      Isoleucine
ebh24-16-16     92         Valine         Isoleucine
                106        Alanine        Valine
                237        Alanine        Phenylalanine
ebh24-16-74     88         Histidine      Arginine
                92         Valine         Isoleucine
                237        Alanine        Phenylalanine
ebh-M18         31         Lysine         Glutamic acid
                176        Valine         Methionine
ebh24-16g8      92         Valine         Isoleucine
                237        Alanine        Phenylalanine
                67         Glutamine      Arginine
                130        Threonine      Isoleucine
                176        Alanine        Methionine
ebh24-16b9      92         Valine         Isoleucine
                237        Alanine        Phenylalanine
                67         Glutamine      Arginine
                140        Threonine      Alanine
                176        Serine         Methionine
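
Entries of the Table of Characterized Mutants can be held in a small data structure and rendered in the common one-letter shorthand (wild-type residue, position, substituted residue). The Python sketch below does this for two mutants only. It assumes the column order Position, Substitution, Wild-type given in the table header; the names ONE_LETTER, MUTANTS and shorthand are hypothetical, and the shorthand strings are a notational convenience that does not appear in the source.

```python
# Standard one-letter codes for the 20 common amino acids.
ONE_LETTER = {
    "Alanine": "A", "Arginine": "R", "Asparagine": "N", "Aspartic acid": "D",
    "Cysteine": "C", "Glutamic acid": "E", "Glutamine": "Q", "Glycine": "G",
    "Histidine": "H", "Isoleucine": "I", "Leucine": "L", "Lysine": "K",
    "Methionine": "M", "Phenylalanine": "F", "Proline": "P", "Serine": "S",
    "Threonine": "T", "Tryptophan": "W", "Tyrosine": "Y", "Valine": "V",
}

# mutant name -> list of (position, substitution, wild-type), copied from the table.
MUTANTS = {
    "ebh25-1": [(195, "Serine", "Asparagine"), (294, "Proline", "Serine")],
    "ebh10-53": [(190, "Tyrosine", "Phenylalanine"), (231, "Arginine", "Glutamic acid")],
}


def shorthand(position, substitution, wild_type):
    """Render one change as <wild-type letter><position><substituted letter>."""
    return f"{ONE_LETTER[wild_type]}{position}{ONE_LETTER[substitution]}"


def describe(mutant):
    """All changes of one mutant, ordered by position."""
    changes = sorted(MUTANTS[mutant])
    return [shorthand(pos, sub, wt) for pos, sub, wt in changes]


for name in MUTANTS:
    print(name, ", ".join(describe(name)))
```

If the two residue columns were in fact intended in the opposite order, the letters on either side of each position would simply swap.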



Example 18: Comparison of epothilone B transformation in cells expressing ebh 
and mutants thereof 

In these experiments, twenty ml of YEME medium in a 125 ml bi-indented
flask was inoculated with 200 µl of a frozen spore preparation of S. lividans TK24, S.
lividans (pANT849-ebh), S. lividans (pANT849-ebh10-53) or S. lividans
(pANT849-ebh24-16) and incubated 48 hours at 230 rpm, 30°C. Thiostrepton, 10 µg/ml, was
added to media inoculated with S. lividans (pANT849-ebh), S. lividans
(pANT849-ebh10-53) and S. lividans (pANT849-ebh24-16). Four ml of culture was transferred to
20 ml of R5 medium in a 125 ml Erlenmeyer flask and incubated 18 hrs at 230 rpm,
30°C. Epothilone B in 100% EtOH was added to each culture to a final concentration
of 0.05% weight/volume. Samples were taken at 0, 24, 48 and 72 hours, with the
exception of the S. lividans (pANT849-ebh24-16) culture, in which the epothilone B
had been completely converted to epothilone F at 48 hours. The samples were
analyzed by HPLC. Results were calculated as a percentage of the epothilone B at
time 0 hours.

Epothilone B:

Time (hours)   TK24    pANT849-ebh   pANT849-ebh10-53   pANT849-ebh24-16
0              100%    100%          100%               100%
24             99%     78%           69%                56%
48             87%     19%           39%                0%
72             87%     0%            3%                 -

Epothilone F:

Time (hours)   TK24    pANT849-ebh   pANT849-ebh10-53   pANT849-ebh24-16
0              0%      0%            0%                 0%
24             0%      4%            9%                 23%
48             0%      21%           29%                52%
72             0%      14%           41%                -
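
Because both tables are expressed as a percentage of the epothilone B present at time 0, the two values for a given strain and time point can be added to see how much of the starting material is still accounted for as epothilone B plus epothilone F. The Python sketch below does this with the tabulated values; the names are hypothetical, None marks the 72 hour sample that was not taken for the pANT849-ebh24-16 culture, and the sum is an illustrative bookkeeping figure rather than a quantity reported in the source.

```python
TIMES_H = [0, 24, 48, 72]

# Percent of time-0 epothilone B, copied from the two tables above.
EPOTHILONE_B = {
    "TK24": [100, 99, 87, 87],
    "pANT849-ebh": [100, 78, 19, 0],
    "pANT849-ebh10-53": [100, 69, 39, 3],
    "pANT849-ebh24-16": [100, 56, 0, None],
}
EPOTHILONE_F = {
    "TK24": [0, 0, 0, 0],
    "pANT849-ebh": [0, 4, 21, 14],
    "pANT849-ebh10-53": [0, 9, 29, 41],
    "pANT849-ebh24-16": [0, 23, 52, None],
}


def accounted_for(strain):
    """Epothilone B + F at each time point (None where no sample was taken)."""
    return [
        None if b is None or f is None else b + f
        for b, f in zip(EPOTHILONE_B[strain], EPOTHILONE_F[strain])
    ]


def peak_f(strain):
    """Highest tabulated epothilone F value and the time at which it occurs."""
    measured = [(f, t) for f, t in zip(EPOTHILONE_F[strain], TIMES_H) if f is not None]
    value, time_h = max(measured)
    return time_h, value


for strain in EPOTHILONE_B:
    print(strain, accounted_for(strain), "peak F (h, %):", peak_f(strain))
```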



Alternatively, the bioconversion of epothilone B to epothilone F was
performed in S. rimosus host cells transformed with expression plasmids containing
the ebh gene and its variants or mutants. One hundred µl of a frozen S. rimosus
transformant culture was inoculated to 20 ml CRM media with 10 µg/ml thiostrepton
and cultivated 16-24 hr, 30°C, 230-300 rpm. Epothilone B in 100% ethanol was
added to each culture to a final concentration of 0.05% weight/volume. The reaction
was typically incubated 20-40 hrs at 30°C, 230-300 rpm. The concentration of
epothilones B and F was determined by HPLC analysis.

Evaluation of mutants in S. rimosus

Mutant          Epothilone F yield
ebh-M18         55%
ebh24-16d8      75%
ebh24-16c11     75%
ebh24-16-16     75%
ebh24-16-74     75%
ebh24-16b9      80%
ebh24-16g8      85%



Example 19: Biotransformation of compactin to pravastatin 

Twenty ml of R2YE media with 10 µg/ml thiostrepton in a 125 ml flask was
inoculated with 200 µl of a frozen spore preparation of S. lividans (pANT849) or S.
lividans (pANT849-ebh) and incubated 72 hours at 230 rpm, 28°C. Four ml of culture
was inoculated to 20 ml of R2YE media and grown 24 hours at 230 rpm, 28°C. One
ml of culture was transferred to a 15 ml polypropylene culture tube, 10 µl of
compactin (40 mg/ml) was added to each culture and incubated for 24 hours, 28°C,
250 rpm. Five hundred µl of the culture broth was transferred to a fresh 15 ml
polypropylene culture tube. Five hundred µl of 50 mM sodium hydroxide was added
and vortexed. Three ml of methanol was added and vortexed, and the tube was
centrifuged 10 minutes at 3000 rpm in a TJ-6 table-top centrifuge. The organic phase
was analyzed by HPLC. Compactin and pravastatin values were assessed relative to
the control S. lividans (pANT849) culture.



Compactin and Pravastatin as a Percentage of Starting Compactin
Concentration:

              S. lividans (pANT849)   S. lividans (pANT849-ebh)
Compactin     36%                     11%
Pravastatin   11%                     53%



Example 20: High performance liquid chromatography method for compactin and
pravastatin detection

The liquid chromatography separation was performed using a Hewlett
Packard 1090 Series Separation system (Agilent Technologies, Palo Alto, California,
USA) and a column, 50 x 4.6 mm, filled with Spherisorb ODS2, particle size 5 µm
(Keystone Scientific, Inc., Bellefonte, Pennsylvania, USA). Gradient mobile
phase programming was used with a flow rate of 2.0 ml/minute. Eluent A was water,
10 mM ammonium acetate and 0.05% phosphoric acid. Eluent B was acetonitrile.
The mobile phase was a linear gradient from 20% B to 90% B over 4 minutes.

Example 21: Structure determination of the biotransformation product of
mutant ebh25-1

Analytical HPLC was performed using a Hewlett Packard 1100 Series Liquid
Chromatograph with a YMC Packed ODS-AQ column, 4.6 mm i.d. x 15 cm length. A
gradient system of water (solvent A) and acetonitrile (solvent B) was used: 20% to
90% B linear gradient, 10 minutes; 90% to 20% B linear gradient, 2 minutes. The flow
rate was 1 ml/minute and UV detection was at 254 nm.

Preparative HPLC was performed using the following equipment and
conditions:

Pump: Varian ProStar Solvent Delivery Module (Varian Inc., Palo Alto, California,
USA).
Detector: Gynkotek UVD340S.
Column: YMC ODS-A column (30 mm ID x 100 mm length, 5 µm particle size).
Elution flow rate: 30 ml/minute.
Elution gradient: (solvent A: water; solvent B: acetonitrile), 20% B, 2 minutes; 20%
to 60% B linear gradient, 18 minutes; 60% B, 2 minutes; 60% to 90% B linear
gradient, 1 minute; 90% B, 3 minutes; 90% to 20% B linear gradient, 2 minutes.
Detection: UV, 210 nm.

LC/NMR was performed as follows: 40 µl of sample was injected onto a
YMC Packed ODS-AQ column (4.6 mm i.d. x 15 cm length). The column was eluted at a
1 ml/minute flow rate with a gradient system of D2O (solvent A) and acetonitrile-d3
(solvent B): 30% B, 1 minute; 30% to 80% B linear gradient, 11 minutes. The eluent
passed a UV detection cell (monitored at 254 nm) before flowing through a F19/H1
NMR probe (60 µl active volume) in a Varian AS-600 NMR spectrometer. The
biotransformation product was eluted at around 7.5 minutes and the flow was stopped
manually to allow the eluent to remain in the NMR probe for NMR data acquisition.

Isolation and analysis were performed as follows. The butanol/methanol extract
(about 10 ml) was evaporated to dryness under a nitrogen stream. One ml of methanol
was added to the residue (38 mg) and insoluble material was removed by
centrifugation (13000 rpm, 2 min). 0.1 ml of the supernatant was used for the LC/NMR
study and the remaining 0.9 ml was subjected to preparative HPLC (0.2-0.4 ml per
injection). Two major peaks were observed and collected: peak A was eluted between
14 and 15 minutes, while peak B was eluted between 16.5 and 17.5 minutes.
Analytical HPLC analysis indicated that peak B was the parent compound, epothilone
B (Rt 8.5 minutes), and peak A was the biotransformation product (Rt 7.3 minutes).
The peak A fractions were pooled and MS analysis data were obtained with the pooled
fractions. The pooled fraction was evaporated to a small volume, then was
lyophilized to give 3 mg of white solid. NMR and HPLC analysis of the white solid
(dissolved in methanol) revealed that the biotransformation product was partially
decomposed during the drying process.

APPENDIX 1

Atom No.  Residue  Atom Name  X-coord   Y-coord   Z-coord
1         ALA9     N          -39.918   -4.913    -1.651
2         ALA9     CA         -38.454   -5.033    -1.537
3         ALA9     C          -37.953   -4.886    -0.099
4         ALA9     O          -38.625   -4.31     0.765
5         ALA9     CB         -37.809   -3.967    -2.415
6         THR10    N          -36.781   -5.447    0.146
7         THR10    CA         -36.187   -5.437    1.49
8         THR10    C          -34.916   -4.585    1.553
9         THR10    O          -34.016   -4.735    0.72
10        THR10    CB         -35.871   -6.887    1.846
11        THR10    OG1        -37.075   -7.631    1.717
12        THR10    CG2        -35.355   -7.053    3.271
13        LEU11    N          -34.858   -3.699    2.536
14        LEU11    CA         -33.669   -2.853    2.745
15        LEU11    C          -32.511   -3.649    3.353
16        LEU11    O          -32.706   -4.468    4.259
17        LEU11    CB         -34.033   -1.707    3.687
18        LEU11    CG         -35.079   -0.78     3.078
19        LEU11    CD1        -35.53    0.265     4.091
20        LEU11    CD2        -34.555   -0.111    1.81
21        PRO12    N          -31.32    -3.422    2.823
22        PRO12    CA         -30.121   -4.119    3.302
23        PRO12    C          -29.652   -3.606    4.663
24        PRO12    O          -29.656   -2.397    4.918
25        PRO12    CB         -29.081   -3.842    2.259
26        PRO12    CG         -29.597   -2.771    1.309
27        PRO12    CD         -31.031   -2.493    1.729
28        LEU13    N          -29.278   -4.522    5.54
29        LEU13    CA         -28.676   -4.118    6.819
30        LEU13    C          -27.183   -3.88     6.627
31        LEU13    O          -26.449   -4.806    6.267
32        LEU13    CB         -28.898   -5.196    7.872
33        LEU13    CG         -30.374   -5.354    8.217
34        LEU13    CD1        -30.587   -6.516    9.181
35        LEU13    CD2        -30.945   -4.067    8.802
36        ALA14    N          -26.72    -2.741    7.112
37        ALA14    CA         -25.355   -2.266    6.825
38        ALA14    C          -24.244   -2.941    7.634
39        ALA14    O          -23.058   -2.719    7.372
40        ALA14    CB         -25.311   -0.764    7.075
41        ARG15    N          -24.628   -3.792    8.569
42        ARG15    CA         -23.664   -4.537    9.379
43        ARG15    C          -23.478   -5.983    8.91
44        ARG15    O          -22.815   -6.767    9.599
45        ARG15    CB         -24.174   -4.519    10.81
46        ARG15    CG         -25.655   -4.879    10.84
47        ARG15    CD         -26.2     -4.843    12.26
48        ARG15    NE         -27.657   -5.039    12.256
49        ARG15    CZ         -28.358   -5.301    13.36
50        ARG15    NH1        -29.69    -5.376    13.3
51        ARG15    NH2        -27.735   -5.412    14.536
52        LYS16    N          -24.096   -6.351    7.798
53        LYS16    CA         -24.016   -7.741    7.335
54        LYS16    C          -22.639   -8.128    6.807
55        LYS16    O          -21.959   -7.359    6.115
56        LYS16    CB         -25.061   -7.977    6.252
57        LYS16    CG         -26.466   -7.985    6.839
58        LYS16    CD         -26.605   -9.079    7.892
59        LYS16    CE         -28.002   -9.092    8.499
60        LYS16    NZ         -28.113   -10.128   9.537
61        CYS17    N          -22.317   -9.392    7.036
62        CYS17    CA         -21.061   -10.004   6.56
63        CYS17    C          -20.737   -9.771    5.066
64        CYS17    O          -19.662   -9.205    4.833
65        CYS17    CB         -21.096   -11.501   6.864
66        CYS17    SG         -21.33    -11.937   8.602
67        PRO18    N          -21.635   -10.003   4.1
68        PRO18    CA         -21.293   -9.756    2.683
69        PRO18    C          -21.123   -8.291    2.246
70        PRO18    O          -21.013   -8.061    1.036
71        PRO18    CB         -22.388   -10.383   1.878
72        PRO18    CG         -23.509   -10.812   2.802
73        PRO18    CD         -23.002   -10.554   4.207
74        PHE19    N          -21.137   -7.33     3.162
75        PHE19    CA         -20.792   -5.947    2.834
76        PHE19    C          -19.279   -5.777    2.788
77        PHE19    O          -18.789   -4.92     2.036
78        PHE19    CB         -21.36    -5.007    3.894
79        PHE19    CG         -22.8     -4.568    3.654
80        PHE19    CD1        -23.051   -3.27     3.232
81        PHE19    CD2        -23.856   -5.444    3.867
82        PHE19    CE1        -24.355   -2.853    3.003
83        PHE19    CE2        -25.159   -5.03     3.629
84        PHE19    CZ         -25.409   -3.735    3.197
85        SER20    N          -18.573   -6.687    3.449
86        SER20    CA         -17.102   -6.717    3.446
87        SER20    C          -16.569   -7.839    4.342
88        SER20    O          -16.632   -7.723    5.573
89        SER20    CB         -16.557   -5.371    3.929
90        SER20    OG         -17.236   -5.019    5.129
91        PRO21    N          -15.974   -8.867    3.753
92        PRO21    CA         -15.978   -9.134    2.304
93        PRO21    C          -17.267   -9.836    1.856
94        PRO21    O          -18.026   -10.327   2.702
95        PRO21    CB         -14.8     -10.047   2.111
96        PRO21    CG         -14.442   -10.669   3.455
97        PRO21    CD         -15.306   -9.949    4.481
98        PRO22    N          -17.551   -9.859    0.561
99        PRO22    CA         -16.897   -9.007
100       PRO22    C          -17.4     -7.575    ■0.^3
101       PRO22    O          -18.341   -7.371    0.469
102       PRO22    CB         -17.32    -9.591    -1.762
103       PRO22    CG         -18.478   -10.549   -1.528
104       PRO22    CD         -18.669   -10.604   -0.021
105       PRO23    N          -16.687   -6.605    -0.842
106       PRO23    CA         -17.224   -5.241    -0.897
107       PRO23    C          -18.525   -5.21     -1.693
108       PRO23    O          -18.524   -5.083    -2.925
109       PRO23    CB         -16.159   -4.417    -1.547
110       PRO23    CG         -15.004   -5.321    -1.95
111       PRO23    CD         -15.388   -6.725    -1.509
112       GLU24    N          -19.62    -5.122    -0.956
113       GLU24    CA         -20.963   -5.192    -1.547
114       GLU24    C          -21.415   -3.843    -2.088
115       GLU24    O          -22.323   -3.794    -2.93
116       GLU24    CB         -21.934   -5.68     -0.48
117       GLU24    CG         -23.27    -6.137    -1.052
118       GLU24    CD         -23.982   -7.017    -0.024
119       GLU24    OE1        -24.613   -7.981    -0.433
120       GLU24    OE2        -23.833   -6.745    1.158
121       TYR25    N          -20.573   -2.843    -1.878
122       TYR25    CA         -20.842   -1.47     -2.303
123       TYR25    C          -20.704   -1.311    -3.816
124       TYR25    O          -21.364   -0.436    -4.385
125       TYR25    CB         -19.828   -0.568    -1.608
126       TYR25    CG         -19.616   -0.882    -0.128
127       TYR25    CD1        -20.662   -0.753    0.779
128       TYR25    CD2        -18.364   -1.298    0.311
129       TYR25    CE1        -20.461   -1.062    2.119
130       TYR25    CE2        -18.163   -1.605    1.65
131       TYR25    CZ         -19.213   -1.492    2.55
132       TYR25    OH         -19.026   -1.859    3.866
133       GLU26    N          -20.1     -2.296    -4.468
134       GLU26    CA         -20.009   -2.293    -5.928
135       GLU26    C          -21.404   -2.483    -6.52
136       GLU26    O          -21.92    -1.572    -7.177
137       GLU26    CB         -19.129   -3.454    -6.39
138       GLU26    CG         -17.813   -3.593    -5.628
139       GLU26    CD         -16.94    -2.342    -5.707
140       GLU26    OE1        -16.345   -2.12     -6.749
141       GLU26    OE2        -16.773   -1.731    -4.657
142       ARG27    N          -22.105   -3.488    -6.017
143       ARG27    CA         -23.437   -3.805    -6.538
144       ARG27    C          -24.504   -2.909    -5.921
145       ARG27    O          -25.496   -2.591    -6.59
146       ARG27    CB         -23.752   -5.26     -6.22
147       ARG27    CG         -22.7     -6.189    -6.812
148       ARG27    CD         -23.031   -7.653    -6.55
149       ARG27    NE         -23.146   -7.926    -5.108
150       ARG27    CZ         -22.251   -8.648    -4.428
151       ARG27    NH1        -21.16    -9.11     -5.043
152       ARG27    NH2        -22.428   -8.879    -3.126
153       LEU28    N          -24.197   -2.331    -4.771
154       LEU28    CA         -25.11    -1.358    -4.168
155       LEU28    C          -25.131   -0.079    -4.987
156       LEU28    O          -26.214   0.286     -5.45
157       LEU28    CB         -24.67    -1.039    -2.746
158       LEU28    CG         -24.868   -2.224    -1.81
159       LEU28    CD1        -24.303   -1.916    -0.43


160 


LEU28 
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-26.34 


-2.609 


-1.716 
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ARG29 


N 


-23.969 


0.307 


-5.49 
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-6.327 
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-24.521 


1.334 


-7.677 


164 
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-25.271 


2.226 


-8.096
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1.682 


-6.568 
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-7.336 
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2.941 


-7.711
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-8.215 
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-8.109 


185 


GLU31 


C 


-29.207 
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-7.656 
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-9.061 


190 


GLU31 


OE1 


-30.013 
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5.408 


-5.869 
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-25.241 


6.254 


-3.758 
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-6.879 
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6.364 


-7.829 
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5.107 


-7.394 


205 


VAL34 


N 
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6.094 


-4.033 
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-27.83 


6.472 


-2.661 
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5.447 


-2.122 
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-30.01 


5.467 


-2.487 
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-28.483 


7.85 


-2.686 
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-28.789 


8.339 


-1.275
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CG2 


-27.616 


8.865 


-3.42 
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SER35 
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-28.344 


4.546 


-1.286 


213 


SER35 
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-29.186 


3.438 


-0.802 
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-29.512 


3.536 


0.688 
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SER35 
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-28.615 


3.692 


1.521 


216 
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-27.19 


2.169 
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• -30.785 


3.413 


1.025 
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-31.168 


3.431 


2.443 


220 
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2.072 


3.082 
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-31.516 


1.059 


2.741 
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3.779 


2.597 
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-33.016 


3.857 


4.076 
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-34.513 


4.047 
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-34.987 


5.35 


3.804 
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5.582 


3.523 
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4.59 


3.609 
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-36.662 


6.791 
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2.067 


3.974 
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-29.543 


0.855 


4.695 
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-29.982 


0.926 


6.152 


232 
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-30.313 


1.995 


6.684 


233 
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CB 


-28.03 


0.681 


4.608 


234 
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-27.591 


0.391 


3.177 
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-27.298 


1.898 


5.163 


236 
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-30.064 


-0.24 


6.761 
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-0.332 


8.18 
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-29.151 


-0.563 


9.016 
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9.003 


240 


LEU39 


N 


-28.764 


0.463 


9.75 
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-27.607 


0.399 


10.656 
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13.539 
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12.644 
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0.686 


12.434 
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O 


-33.59


1.208 


11.997 
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N 


-31.471 


1.392 


12.713 
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CA 


-31.501 


2.847 


12.494 


267 
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-31.201 


3.16 


11.025 
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-30.079 


2.955 


10.551 
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-30.507 



3.53 


13.437 
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-30.681 
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5.05 


13.439 


271 


GLN43 


CD 


-29.873 


5.699 


14.567 


272 


GLN43 
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-30.31 


6.682 


15.184 


273 


GLN43 


NE2 


-28.723 
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14.852 
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"f"l 1 l~"> A A 

THR44 
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-32.227 


3.582 


10.304 


275 


THR44 


CA 
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-32.096 


3.832 


8.859 
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8.534 
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-31.231 


6.071 
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278 


THR44 


CB 


-33.475 


4.077 


8.258 
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THR44 


OG1 


-34.009 


5.268 


8.823 


280 


THR44 


CG2 


-34.428 


2.923 


8.551 


281 


ALA45 


N 


-30.35 


4.799 


7.541 


282 


ALA45 


CA 


-29.426 


5.833 


7.07 


283 


ALA45 


C 


-29.16 


5.718 


5.572 


284 


ALA45 


O 


-29.105 


4.619 


5.009 


285 


ALA45 


CB 


-28.115 


5.705 


7.836 
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TRP46 


N 


-28.989 


6.859 


4.931 


287 
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CA 


-28.702 


6.865 


3.492 


288 


TRP46 


C 


-27.212 


6.698 


3.221 


289 


TRP46 


O 


-26.408 


7.589 


3.517 


290 


TRP46 


CB 


-29.185 


8.173 


2.881 


291 


TRP46 


CG 


-30.693 


8.309 


2.805 


292 


TRP46 


CD1 


-31.509 


9.009 


3.665 


293 


TRP46 


CD2 


-31.552 


7.723 


1.804 


294 


TRP46 


NE1 


-32.788 


8.894 


3.228 


295 


TRP46 


CE2 


-32.862 


8.146 


2.116 


296 


TRP46 


CE3 


-31.324


6.922 


0.701 


297 


TRP46 


CZ2


-33.913 


7.774 


1.295 


298 


TRP46 
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-32.389 


6.538 


-0.105 


299 


TRP46 


CH2 


-33.68 


6.967 


0.19 
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N 


-26.863 


5.559 


2.652 


301 


ALA47 


CA 


-25.475 


5.257 


2.302 


302 


ALA47 


C 


-25.153 


5.708 


0.882 


303 


ALA47 


O 


-25.772 


5.272 


-0.1 


304 


ALA47 


CB 


-25.248 


3.756 


2.427 


305 


LEU48 


N 


-24.185 


6.602 


0.797 


306 


LEU48 


CA


-23.751 


7.129 


-0.501 


307 


LEU48 


C 


-22.648 


6.252 


-1.067


308 


LEU48 


O 


-21.546 


6.197 


-0.511 


309 


LEU48 


CB 


-23.222


8.543 


-0.317 


310 


LEU48 


CG 


-24.27 


9.464 


0.289 


311 


LEU48 


CD1 


-23.707 


10.863 


0.454 


312 


LEU48 
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-25.524 


9.515 


-0.569 
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5.601 


-2.176 
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4.636 


-2.75 


315
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5.214 


-3.907 
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-20.047 


4.803 


-4.09 
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THR49 


CB 


-22.774 


3.391 


-3.196 


318 
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-23.783 


3.769 


-4.125 


319 


THR49 


CG2 


-23.458 


2.703 


-2.02 


320 


ARG50 


N 


-21.724


6.2 


-4.616 


321 


ARG50 


CA 


-20.899 


6.838 


-5.655 


322 


ARG50 


C 


-20.007 


7.927 


-5.081 
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323 


ARG50 


O 


-20.456 


8.712 


-4.234 


324 


ARG50 


CB 


-21.737 


7.467 


-6.758 


325 


ARG50 


CG 


-22.426 


6.441 


-7.639 


326 


ARG50 


CD 


-22.852 


7.085 


-8.951 


327 


ARG50 


NE 


-23.597 


8.327 


-8.704 


328 


ARG50 


CZ 


-23.779 


9.27 


-9.629 


329 


ARG50 


NH1 


-24.462 


10.375 


-9.326 


330 


ARG50 


NH2


-23.274 


9.111 


-10.854 


331 


LEU51 


N 


-18.92 


8.175 


-5.797 


332 


LEU51 


CA 


-17.931 


9.19 


-5.399 


333 


LEU51 


C 


-18.52 


10.584 


-5.583 


334 


LEU51 


O 


-18.42 


11.426 


-4.682 


335 


LEU51 


CB 


-16.726 


9.066 


-6.33 


336 


LEU51 


CG 


-15.377 


9.193 


-5.621 


337 


LEU51 


CD1 


-14.233 


9.154 


-6.628 


338 


LEU51 


CD2 


-15.267 


10.433 


-4.746 


339 


GLU52 


N 


-19.404 


10.68 


-6.562 


340 


GLU52 


CA 


-20.088 


11.93 


-6.891 


341 


GLU52 


C 


-21.101 


12.314 


-5.811 


342 


GLU52 


O 


-21.114 


13.477 


-5.389 


343 


GLU52 


CB 


-20.821 


11.759 


-8.229 


344 


GLU52 


CG 


-19.897 


11.56 


-9.439 


345 


GLU52 


CD 


-19.749 


10.09 


-9.853 


346 


GLU52 


OE1 


-19.796 


9.24 


-8.971 


347 


GLU52 


OE2 


-19.502 


9.849 


-11.025 


348 


ASP53 


N 


-21.659 


11.313 


-5.146 


349 


ASP53 


CA 


-22.646 


11.572 


-4.096 


350 


ASP53 


C 


-21.953 


11.905 


-2.783 


351 


ASP53 


O 


-22.4 


12.804 


-2.063 


352 


ASP53 


CB 


-23.493 


10.322 


-3.876 


353 


ASP53 


CG 


-24.263 


9.94 


-5.133 


354 


ASP53 


OD1 


-24.319 


8.749 


-5.405 


355 


ASP53 


OD2 


-24.633 


10.838 


-5.878 


356 


ILE54 


N 


-20.75


11.382 


-2.614 


357 


ILE54 


CA 


-19.991 


11.62 


-1.387


358 


ILE54 


C 


-19.301 


12.976


-1.41 


359 


ILE54 


O 


-19.36 


13.7 


-0.409 


360 


ILE54 


CB 


-18.963 


10.509 


-1.269 


361 


ILE54 


CG1 


-19.674 


9.167 


-1.252 


362 


ILE54 
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-18.113 


10.671 


-0.015 


363 


ILE54 
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-18.677 


8.03 


-1.365 
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13.43 


-2.592 
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14.776 
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-19.44 


15.836 
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16.893 


-2.065 
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-17.551 


14.883 
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-16.293 


14.028 


-3.94 
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-15.498 


14.133 


-5.235 
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ARG55 
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-16.277 


13.61 


-6.367 


372 


ARG55 


CZ 


-15.712 


13.028 


-7.427 


373 


ARG55 


NH1 


-14.383 


12.947 


-7.513 


374 


ARG55 


NH2 


-16.475 


12.553 


-8.413 


375 


GLU56 


N 


-20.64 


15.438 


-3.068 


376 


GLU56 


CA 


-21.795 


16.331 


-2.984 
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CZ 


3014 


TYR395 


OH 


3015 


GLY396 


N 


3016 


GLY396 


CA 


3017 


GLY396 


C
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O 


3019 
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N 
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C 
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3023 


LEU397 


CB 


-5.051 


-5.894 


9.311 


3024 


LEU397 


CG 


-5.773 


-6.502 


8.12 


3025 


LEU397 


CD1 


-5.037 


-6.225 


6.822 


3026 


LEU397 


CD2 


-5.979 


-7.996 


8.325 


3027 
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N 


-5.87 


-6.402 
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3028 
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CA 
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1 1 .336 
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-14.087 
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-16.439 


13.4 


8.866 
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What is Claimed is : 

1. An isolated nucleic acid sequence encoding epothilone B hydroxylase 
or a mutant or variant thereof. 

2. The isolated nucleic acid sequence of claim 1 comprising SEQ ID NO: 
1, 30, 32, 34, 36, 37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 72 or 74. 

3. The isolated nucleic acid sequence of claim 1 comprising SEQ ID
NO:1.

4. The isolated nucleic acid sequence of claim 1 encoding a mutant with 
at least one amino acid substitution in an active site of the epothilone B hydroxylase 
enzyme. 

5. The isolated nucleic acid sequence of claim 1 encoding a mutant with
at least one amino acid substitution at amino acid GLU31, ARG67, ARG88, ILE92,
ALA93, VAL106, ILE130, ALA140, MET176, PHE190, GLU231, SER294,
PHE237, or ILE365 of SEQ ID NO:2.

6. The isolated nucleic acid sequence of claim 1 encoding a mutant with
at least one amino acid substitution at amino acid LEU39, GLN43, ALA45, MET57,
LEU58, HIS62, PHE63, SER64, SER65, ASP66, ARG67, GLN68, SER69, LEU74,
MET75, VAL76, ALA77, ARG78, GLN79, ILE80, ASP84, LYS85, PRO86, PHE87,
ARG88, PRO89, SER90, LEU91, ILE92, ALA93, MET94, ASP95, HIS99, ARG103,
PHE110, ILE155, PHE169, GLN170, CYS172, SER173, SER174, ARG175,
MET176, LEU177, SER178, ARG179, ARG186, PHE190, LEU193, VAL233,
GLY234, LEU235, ALA236, PHE237, LEU238, LEU239, LEU240, ILE241,
ALA242, GLY243, HIS244, GLU245, THR246, THR247, ALA248, ASN249,
MET250, LEU283, THR287, ILE288, ALA289, GLU290, THR291, ALA292,
THR293, SER294, ARG295, PHE296, ALA297, THR298, GLU312, GLY313,
VAL314, VAL315, GLY316, VAL344, ALA345, PHE346, GLY347, PHE348,
VAL350, HIS351, GLN352, CYS353, LEU354, GLY355, GLN356, LEU358,
ALA359, GLU362, LYS389, ASP391, SER392, THR393, ILE394, or TYR395 of
SEQ ID NO:2.

7. The isolated nucleic acid sequence of claim 1 encoding a variant
comprising SEQ ID NO:43, 44, 45, 46, 47, 48 or 49.

8. A polypeptide encoded by the isolated nucleic acid sequence of claim
1.

9. An isolated nucleic acid molecule that is capable of hybridizing to a
nucleic acid sequence of claim 2, or to the complementary sequence of said nucleic
acid sequence, under hybridization conditions of 3X SSC at 65°C for 16 hours, said
isolated nucleic acid molecule being capable of remaining hybridized to said nucleic
acid sequence, or to the complementary sequence of said nucleic acid sequence, under
wash conditions of 0.5X SSC, 55°C for 30 minutes.

10. An isolated polypeptide comprising SEQ ID NO:2. 

11. An isolated mutant polypeptide of epothilone B hydroxylase of SEQ
ID NO: 2 comprising an amino acid sequence with at least one amino acid substitution
in an active site of epothilone B hydroxylase enzyme of SEQ ID NO: 2.

12. An isolated mutant polypeptide of epothilone B hydroxylase of SEQ
ID NO: 2 comprising an amino acid sequence with at least one amino acid substitution
at amino acid GLU31, ARG67, ARG88, ILE92, ALA93, VAL106, ILE130,
ALA140, MET176, PHE190, GLU231, SER294, PHE237, or ILE365 of SEQ ID
NO:2.

13. An isolated mutant polypeptide of epothilone B hydroxylase of SEQ
ID NO:2 comprising an amino acid sequence with at least one amino acid substitution
at amino acid LEU39, GLN43, ALA45, MET57, LEU58, HIS62, PHE63, SER64,
SER65, ASP66, ARG67, GLN68, SER69, LEU74, MET75, VAL76, ALA77,
ARG78, GLN79, ILE80, ASP84, LYS85, PRO86, PHE87, ARG88, PRO89, SER90,
LEU91, ILE92, ALA93, MET94, ASP95, HIS99, ARG103, PHE110, ILE155,
PHE169, GLN170, CYS172, SER173, SER174, ARG175, MET176, LEU177,
SER178, ARG179, ARG186, PHE190, LEU193, VAL233, GLY234, LEU235,
ALA236, PHE237, LEU238, LEU239, LEU240, ILE241, ALA242, GLY243,
HIS244, GLU245, THR246, THR247, ALA248, ASN249, MET250, LEU283,
THR287, ILE288, ALA289, GLU290, THR291, ALA292, THR293, SER294,
ARG295, PHE296, ALA297, THR298, GLU312, GLY313, VAL314, VAL315,
GLY316, VAL344, ALA345, PHE346, GLY347, PHE348, VAL350, HIS351,
GLN352, CYS353, LEU354, GLY355, GLN356, LEU358, ALA359, GLU362,
LYS389, ASP391, SER392, THR393, ILE394, or TYR395 of SEQ ID NO:2.

14. An isolated mutant polypeptide of epothilone B hydroxylase 
comprising SEQ ID NO: 31, 33, 35, 61, 63, 65, 67, 69, 71, 73 or 75. 

15. An isolated variant polypeptide of epothilone B hydroxylase
comprising SEQ ID NO: 43, 44, 45, 46, 47, 48 or 49.

16. An isolated nucleic acid sequence encoding a ferredoxin. 

17. The isolated nucleic acid sequence of claim 16 comprising SEQ ID
NO:3.

18. A polypeptide encoded by the isolated nucleic acid sequence of claim
16.

19. An isolated nucleic acid molecule that is capable of hybridizing to the
nucleic acid sequence set forth in SEQ ID NO:3, or to the complementary sequence of
the nucleic acid sequence set forth in SEQ ID NO:3, under hybridization conditions of
3X SSC at 65°C for 16 hours, said isolated nucleic acid molecule being capable of
remaining hybridized to the nucleic acid sequence set forth in SEQ ID NO:3, or to the
complementary sequence of the nucleic acid sequence set forth in SEQ ID NO:3,
under wash conditions of 0.5X SSC, 55°C for 30 minutes.

20. A vector comprising the isolated nucleic acid sequence of claim 1.

21. The vector of claim 20 further comprising an isolated nucleic acid
sequence encoding a ferredoxin.
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22. A host cell comprising the vector of claim 20. 

23. A host cell comprising the vector of claim 21.

24. A method for producing recombinant microorganisms which 
hydroxylate epothilones having a terminal alkyl group to produce epothilones having 
a terminal hydroxyalkyl group, said method comprising transfecting a microorganism 
with the vector of claim 20 or 21.

25. A recombinantly produced microorganism that hydroxylates 
epothilones having a terminal alkyl group to produce epothilones having a terminal 
hydroxyalkyl group. 

26. The recombinantly produced microorganism of claim 25 wherein said
microorganism expresses a nucleic acid sequence of SEQ ID NO: 1, 30, 32, 34, 36,
37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 72 or 74.

27. A method for the preparation of at least one epothilone of the
following formula I

HO-CH₂-(A₁)ₙ-(Q)ₘ-(A₂)ₒ-E (I)

where

A₁ and A₂ are independently selected from the group of optionally substituted C₁-C₃
alkyl and alkenyl;

Q is an optionally substituted ring system containing one to three rings and at least
one carbon to carbon double bond in at least one ring;

n, m, and o are integers selected from the group consisting of zero and 1, where at
least one of m or n or o is 1; and
E is an epothilone core;
comprising the steps of contacting at least one epothilone of the following formula II

CH₃-(A₁)ₙ-(Q)ₘ-(A₂)ₒ-E (II)
where A₁, Q, A₂, E, n, m, and o are defined as above;
with a recombinantly produced microorganism, or an enzyme derived therefrom,
which is capable of selectively catalyzing the hydroxylation of Formula II, and
effecting said hydroxylation.



28. A method for the preparation of an epothilone analog of Formula A

[Chemical structure of Formula A]

said method comprising biotransforming epothilone B to the epothilone analog of
Formula A by incubation with a mutant epothilone B hydroxylase enzyme comprising
SEQ ID NO:31.



29. A compound of Formula A

[Chemical structure of Formula A]

or a pharmaceutically acceptable salt thereof.



30. A homology model of epothilone B hydroxylase having a root mean
square deviation of conserved residue backbone atoms of less than about 4.0 Å when
superimposed on a corresponding backbone atoms described by structure coordinates
listed in Appendix 1.
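The root mean square deviation recited in claim 30 can be illustrated with a minimal sketch. It assumes the two sets of backbone atoms have already been superimposed and are supplied in the same order; the function name `rmsd` and the toy coordinates in the usage note are illustrative only and are not taken from Appendix 1.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(model: Sequence[Point], reference: Sequence[Point]) -> float:
    """Root mean square deviation, in the input units (e.g. angstroms).

    Assumption: both coordinate sets are already superimposed and list
    equivalent backbone atoms in the same order.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(model, reference)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(model))
```

For example, `rmsd(model_backbone, appendix_backbone) < 4.0` would express the threshold of the claim, where both arguments are hypothetical lists of (x, y, z) tuples.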
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31. A method for producing a mutant with altered biological properties, 
function, yield of a desired product, rate of reaction, substrate specificity, or activity 
as compared to epothilone B hydroxylase, said method comprising the steps of: 
identifying an amino acid of SEQ ID NO:2 to mutate; and mutating the identified
amino acid to create a mutant protein.

32. The method of claim 31 wherein a homology model of epothilone B
hydroxylase having a root mean square deviation of conserved residue backbone
atoms of less than about 4.0 Å when superimposed on a corresponding backbone
atoms described by structure coordinates listed in Appendix 1 is used to identify an
amino acid of SEQ ID NO: 2 to mutate.

33. The method of claim 31 wherein the identified amino acid is LEU39, 
GLN43, ALA45, MET57, LEU58, HIS62, PHE63, SER64, SER65, ASP66, ARG67, 
GLN68, SER69, LEU74, MET75, VAL76, ALA77, ARG78, GLN79, ILE80, ASP84, 
LYS85, PRO86, PHE87, ARG88, PRO89, SER90, LEU91, ILE92, ALA93, MET94,
ASP95, HIS99, ARG103, PHE110, ILE155, PHE169, GLN170, CYS172, SER173,
SER174, ARG175, MET176, LEU177, SER178, ARG179, ARG186, PHE190, 
LEU193, VAL233, GLY234, LEU235, ALA236, PHE237, LEU238, LEU239, 
LEU240, ILE241, ALA242, GLY243, HIS244, GLU245, THR246, THR247, 
ALA248, ASN249, MET250, LEU283, THR287, ILE288, ALA289, GLU290, 
THR291, ALA292, THR293, SER294, ARG295, PHE296, ALA297, THR298, 
GLU312, GLY313, VAL314, VAL315, GLY316, VAL344, ALA345, PHE346, 
GLY347, PHE348, VAL350, HIS351, GLN352, CYS353, LEU354, GLY355, 
GLN356, LEU358, ALA359, GLU362, LYS389, ASP391, SER392, THR393, 
ILE394, or TYR395 of SEQ ID NO:2. 

34. The method of claim 31 wherein the identified amino acid is GLU31,
ARG67, ARG88, ILE92, ALA93, VAL106, ILE130, ALA140, MET176, PHE190,
GLU231, SER294, PHE237, or ILE365 of SEQ ID NO:2.
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35. The method of claim 31 wherein the mutant protein improves yield of 
a desired product as compared to the yield of a desired product obtained using 
epothilone B hydroxylase. 

36. The method of claim 35 wherein the desired product is epothilone F.

37. The method of claim 31 wherein the mutant improves the rate of
reaction as compared to the rate of reaction using epothilone B hydroxylase. 

38. The method of claim 31 wherein the mutant exhibits altered substrate
specificity as compared to substrate specificity of epothilone B hydroxylase.

39. The method of claim 38 wherein amino acid SER294 is mutated. 

40. The method of claim 31 wherein the mutant exhibits essentially
similar biological activity or function to epothilone B hydroxylase.

41. A machine-readable data storage medium comprising a data storage 
material encoded with structure coordinates set forth in Appendix 1. 
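As an illustration of how structure coordinates of the kind set forth in Appendix 1 might be held in machine-readable form, the following is a minimal sketch. It assumes each record consists of six whitespace-separated fields in the order atom number, residue, atom name, x, y, z; that column order is inferred from the appendix layout, and the names `Atom`, `parse_atoms` and `distance` are hypothetical.

```python
import math
from typing import List, NamedTuple


class Atom(NamedTuple):
    serial: int    # atom number
    residue: str   # residue name and position, e.g. "TYR25"
    name: str      # atom name, e.g. "CB"
    x: float
    y: float
    z: float


def parse_atoms(text: str) -> List[Atom]:
    """Parse whitespace-separated records of six fields each (assumed order)."""
    tokens = text.split()
    if len(tokens) % 6 != 0:
        raise ValueError("expected six fields per atom record")
    atoms = []
    for i in range(0, len(tokens), 6):
        serial, residue, name, x, y, z = tokens[i:i + 6]
        atoms.append(Atom(int(serial), residue, name, float(x), float(y), float(z)))
    return atoms


def distance(a: Atom, b: Atom) -> float:
    """Euclidean distance between two atoms, in the units of the coordinates."""
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))
```

Applied to the TYR25 CB and CG records of Appendix 1 (atoms 125 and 126), `distance` gives a separation of about 1.5, which is consistent with a carbon to carbon single bond if the coordinates are in angstroms.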
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Alignment used to design primers P450-l + and P450-la + 

STMSUACB tcctcatcgccggccacgagac (SEQ ID NO: 5) 

STMSUBCB tgctggtcgccggccacgagac (SEQ ID NO: 6) 

3702259 tgctcatcaccggccaggacac (SEQ ID NO: 7) 

SSU65940 — ctgttcgccgggcacgactc (SEQ ID NO: 8) 

STMOLEP tgctcatcgcgggccacgagac (SEQ ID NO: 9)

SERCP450A tgctggtcgccgggcacgagac (SEQ ID NO: 10) 

Alignment used to design primers P450-2+ and P450-2-

STMSUACB cggcgcggtggaggaactgct (SEQ ID NO: 11) 

STMSUBCB gggcgccgtcgaggagctgct (SEQ ID NO: 12) 

3702259 ccgcaccctggaggagctgct (SEQ ID NO: 13) 

SSU65940 cggcgcggtcgaggagatgct (SEQ ID NO: 14)

STMOLEP cgcggcggtggaggagatgct (SEQ ID NO: 15) 

SERCP450A cggcgcgatcgaggagaccct (SEQ ID NO: 16) 

Alignment used to design primer P450-3-

STMSUACB ttcggcttcggcgtgcaccagtgcctgggc (SEQ ID NO: 17) 

STMSUBCB ttcggcttcggcgtccaccagtgcctggga (SEQ ID NO: 18) 

3702259 ttcggctggggcccccaccactgcctgggc (SEQ ID NO: 19)

SSU65940 ttcggtcacggcgtccacaagtgtcctggc (SEQ ID NO: 20)

STMOLEP ttcgggcacggagcgcaccactgcatcggc (SEQ ID NO: 21)

SERCP450A ttcggccacggcatccacttctgcgtgggc (SEQ ID NO: 22)

FIG. 2 
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EPO-B MTDVEETTATLPLARKCPFSPP - - PEYERIjRRESPVSRVGLPSGQTAWALTRLiEDIREML 

UINA ATVPDLESDS FHVDWYST YAELRETAPVTPVRFL - GQDAVJLVTGYDEAKAAL 

**:* .* * .**. * : ** ** :* :: : * 

EPO-B SSPHFSSD- -RQSPSFPLMVARQI- -RREDKP-FRPSLIAMDPPEHGKARRDVVGEFTVK 

UINA SDLRLSSDPKJOCYPGVEVEFPAYLGFPEDVRNYFATNMGTSDPPTHTRLRKLVSQEFTVR 

* . : ;*** ; : *^ j p ♦ <; • * . . *** * . * . * **** . 

EPO-B RMKALQPRIQQIVDEHIDALLAGPKPADIjVQALSLPVPSIiVICELLGVPYSDHEFFQSCS 
UINA RVEAMRPRVEQ I TAELLDEV- GDS GWD I VDRFAHPLP I KV I CELLGVDEAARGAFGRWS 

*::*::**::**. * :* ; ... :: * : * ******** : . * * 

EPO-B SRMLSREVT - AEERMTAFESLENYLDELVTKKEANATEDDLLGRQILKQRESGEADHGEL 

UINA SEILVMDPERAEQRGQAAREWNFILDLVERRRTEPGDDIjLSALISVQDDDDGRLSADEL 
* . : * : ** : * * ;** : ** . :;: :.*. . .** 

EPO-B VGLAFLLIi I AGHETTANM I SLGT VTLLENPDQLAKI KADPGKTLAAI EELLR I FTIAETA 

UINA TSIALVLLIiAGFEASVSLIGIGTYLLLTHPDQLALVRADPSALPNAVEEILRYIAPPETT 
..:*::**:**.*::..:*.:** ** : ***** :: *** # *.★*.** .. _ ** . 

EPO-B TS RFATADVE I GGTL I RAGEGWGLSNAGNHDPDGFENPDTFD I ERGARHHVAFGFGVHQ 

UINA T-RFAAEEVEIGGVAIPQYSTVIiVANGAANRDPSQFPDPHRFDVTRDTRGHLSFGQGIHF 

* ***. : ***** # * _ * : ..*.*:**. * :*. **: *.:* * ; : ** * : * 

EPO-B CLGQNIiARIiELQIVFDTLFRRVPGIRIAVPVDELPFKHDSTIYGLHALPVTW-- 
UINA CMGRPIiAKLEGEVALRALFGRFPALSLGIDADDWWRRSLLLRGIDHLPVRLDG 
*.*. **.** : : , i : ** * . * m : .*:: : * : *** 

FIG. 3 
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SEQUENCE LISTING 

<110> Bristol-Myers Squibb Company 

<120> COMPOSITIONS AND METHODS FOR HYDROXYLATING EPOTHILONES

<130> D0231 PCT2 

<150> US 10/321,188 

<151> 2002-12-17 

<160> 76 

<170> PatentIn version 3.1

<210> 1 

<211> 1186 

<212> DNA 

<213> Amycolatopsis orientalis 

<400> 1 



atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60
ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120
tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180
ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240
cggcgcgagg acaagccgtt ccgcccgtcc ctcatcgcga tggacccgcc ggaacacggc 300
aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360
cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420
gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480
gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540
gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600
gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660
cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcgtt cctcctgctc 720
atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780
aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840
gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900
gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960
ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020
cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080
ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140
gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacg 1186

<210> 2 
<211> 404 
<212> PRT 

<213> Amycolatopsis orientalis 
<400> 2 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Ile Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190
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Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala 
195 200 205 



Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Phe Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp
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<210> 3 
<211> 195 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 3 

atgaagatca tcgcggacac cgggaagtgc gtgggggcgg gccagtgcgt gctcaccgat 60 
cccgatctgt tcgaccagag cgaggacgac gggacggtcc tcctgctgaa cgccgagccc 120
gaaggcgaag aggcggagga gaacgcgcgc accgccgtgc acatctgccc ggggcaggca 180 
ctttcgctcg cgtag 195 

<210> 4 
<211> 64 
<212> PRT 

<213> Amycolatopsis orientalis 
<400> 4 

Met Lys Ile Ile Ala Asp Thr Gly Lys Cys Val Gly Ala Gly Gln Cys
1 5 10 15

Val Leu Thr Asp Pro Asp Leu Phe Asp Gln Ser Glu Asp Asp Gly Thr
20 25 30

Val Leu Leu Leu Asn Ala Glu Pro Glu Gly Glu Glu Ala Glu Glu Asn
35 40 45

Ala Arg Thr Ala Val His Ile Cys Pro Gly Gln Ala Leu Ser Leu Ala
50 55 60
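The correspondence between the ferredoxin nucleotide sequence (SEQ ID NO:3) and its amino acid sequence (SEQ ID NO:4) can be checked with a short translation sketch. It assumes the standard genetic code and reading frame 1; the names `CODON_TABLE`, `SEQ_ID_NO_3` and `translate` are illustrative.

```python
# Standard genetic code, built in t/c/a/g order; "*" marks a stop codon.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# SEQ ID NO:3 as given in the sequence listing (195 nucleotides).
SEQ_ID_NO_3 = (
    "atgaagatca tcgcggacac cgggaagtgc gtgggggcgg gccagtgcgt gctcaccgat "
    "cccgatctgt tcgaccagag cgaggacgac gggacggtcc tcctgctgaa cgccgagccc "
    "gaaggcgaag aggcggagga gaacgcgcgc accgccgtgc acatctgccc ggggcaggca "
    "ctttcgctcg cgtag"
)


def translate(dna: str) -> str:
    """Translate frame 1 into one-letter amino acids, stopping at the first stop codon."""
    dna = "".join(dna.split()).lower()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)
```

Calling `translate(SEQ_ID_NO_3)` yields a 64-residue sequence beginning MKIIADTGKC, which agrees with SEQ ID NO:4 when the latter is written in one-letter code.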



<210> 5 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 5 

tcctcatcgc cggccacgag ac 22 

<210> 6 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 
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<223> Synthetic 

<400> 6

tgctggtcgc cggccacgag ac 

<210> 7 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 7 

tgctcatcac cggccaggac ac 

<210> 8 

<211> 20 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 8 

ctgttcgccg ggcacgactc 

<210> 9 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 9 

tgctcatcgc gggccacgag ac 

<210> 10 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 10 

tgctggtcgc cgggcacgag ac 

<210> 11 

<211> 21 

<212> DNA 
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<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 11 

cggcgcggtg gaggaactgc t 21 



<210> 12 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic

<400> 12 

gggcgccgtc gaggagctgc t 21 



<210> 13 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 13 

ccgcaccctg gaggagctgc t 21 



<210> 14 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 14 

cggcgcggtc gaggagatgc t 21 



<210> 15 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 15 

cgcggcggtg gaggagatgc t 21 
<210> 16

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 16 

cggcgcgatc gaggagaccc t 21 

<210> 17 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 17 

ttcggcttcg gcgtgcacca gtgcctgggc 30 

<210> 18

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 18 

ttcggcttcg gcgtccacca gtgcctggga 30 

<210> 19 

<211> 30

<212> DNA

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 19 

ttcggctggg gcccccacca ctgcctgggc 30 



<210> 20 

<211> 30

<212> DNA 

<213> Artificial sequence
<220>

<223> Synthetic

<400> 20

ttcggtcacg gcgtccacaa gtgtcctggc 30



<210> 21 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 21 

ttcgggcacg gagcgcacca ctgcatcggc 30 

<210> 22 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 



<400> 22

ttcggccacg gcatccactt ctgcgtgggc 30

<210> 23

<211> 25

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 23

tgctgctsdt cgccggbcab gasac 25

<210> 24

<211> 25

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<220>

<221> misc_feature

<222> (9)..(9)

<223> n=a, c, g or t

<400> 24

tgmtssysnt cgscgsbcay gasac 25
<210> 25

<211> 24 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 25 

cggvgcsvts gaggarmtgc tgcg 24 

<210> 26 

<211> 24 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 26 

cgcagcakyt cctcsabsgc bccg 24 

<210> 27 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 27 

gcccaggcas ahcacsywg gcdybggctt 30 



<210> 28 

<211> 27

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 28 

gcgagatcta cctggggaag gacaacc 27



<210> 29 

<211> 27 

<212> DNA 

<213> Artificial sequence
<220> 

<223> Synthetic 

<400> 29 
gcgaagctta cggacttgga ccctacg 27

<210> 30 
<211> 1215 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 30 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 

cggcgcgagg acaagccgtt ccgcccgtcc ctcatcgcga tggacccgcc ggaacacggc 300 

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540 

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agagctatct cgacgaactc 600 

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcgtt cttgctgctc 720

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840 

gaactcctgc ggatcttcac catcgcggag acggcgaccc cacgcttcgc cacggcggac 900 

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960 

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200

ccggtcacct ggtag 1215

<210> 31
<211> 404 
<212> PRT

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 31 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys 
1 5 10 15 

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Ile Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175



Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser 
180 185 190 



Leu Glu Ser Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205



Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Phe Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Pro Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400 



Pro Val Thr Trp 



<210> 32 
<211> 1215 
<212> DNA

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 32 



atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240

cggcgcgagg acaagccgtt ccgcccgtcc ctcatcgcga tggacccgcc ggaacacggc 300

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540

gtcaccgccg aagaacggat gaccgcgtac gagtcgctcg agaactatct cgacgaactc 600

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660

cagcgcgaat ccggcgaagc cgaccacggc cgcctggtcg gtctggcgtt cctcctgctc 720

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200

ccggtcacct ggtag 1215

<210> 33 

<211> 404 

<212> PRT 

<213> Artificial sequence 

<220>

<223> Synthetic 
<400> 33

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Ile Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Tyr Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Arg Leu Val Gly Leu Ala Phe Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400 



Pro Val Thr Trp 



<210> 34 

<211> 1215 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 34

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240

cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200

ccggtcacct ggtag 1215



<210> 35 

<211> 404 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 35 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys 
1 5 10 15 






Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp 



<210> 36 
<211> 1104 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 36 

gcgaccttgc cgctggcccg caaatgcccg ttttcaccgc cgcccgaata cgagcggctt 60 
cgccgggaaa gtccggtttc ccgggtcggt ctcccgtccg gtcaaaccgc ttgggcgctc 120 
acccggctcg aggacatccg cgaaatgctg agcagtccgc atttcagctc cgaccggcag 180
agtccgtcgt tcccgctgat ggtggcccgg cagatccggc gcgaggacaa gccgttccgc 240
ccgtccctca tcgcgatgga cccgccggaa cacagcaagg ccaggcgtga cgtcgtcggg 300
gaattcaccg tcaagcgcat gaaagcgctt cagccgcgta ttcagcagat cgtcgacgag 360
catatcgacg ccatgctcgc cggccccaaa cccgccgatc tcgtccaggc gctttccctg 420
ccggttccgt ccttggtgat ctgcgaactg ctcggtgtcc cctattcgga ccacgagttc 480
ttccagtcct gcagttcccg gatgctcagc cgggaagtca ccgccgaaga acggatgacc 540
gcgttcgagt cgctcgagaa ctatctcgac gaactcgtca cgaagaagga ggcgaacgcc 600
accgaggacg acctcctcgg ccgccagatc ctgaagcagc gcgaaacggg cgaagccgac 660
cacggcgaac tcgtcgggct ggcgttcctg ctgctcatcg cgggacacga gacgacggcg 720
aacatgatct cgctcggcac ggcgaccctg ctggagaacc ccgaccagct ggcgaagatc 780
aaggccgatc cgggcaagac cctcgccgcg atcgaggagc tcctgcgggt cttcaccatc 840
gcggagacgg cgacctcacg cttcgccacg gcggacgtcg agatcggcgg cacgctcatc 900
cgcgcgggtg aaggcgtcgt cggcctgagc aacgcgggca accacgatcc ggaaggcttc 960
gagaacccgg acgccttcga catcgaacgc ggcgcgcggc accacgtcgc cttcggattc 1020
ggtgtgcacc aatgcctcgg ccagaacttg gcgaggttgg aactccagat cgtgttcgat 1080
acgttgttcc ggcgagtgcc gggc 1104



<210> 37 
<211> 1103 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 37 

gaccttgccg ctggcccgga aatgcccgtt ttcgccgccg cccgaatacg aacggcttcg 60 
ccgggaaagt ccggtttccc gggtcggtct cccgtccggt caaacggctt gggcgctcac 120 
ccggctcgaa gacatccgcg aaatgctgag cagcccgcat ttcagttccg accggcagag 180 
cccgtcgttc ccgctgatgg tcgcgcggca gatccgccgc gaggacaagc cgttccgccc 240 
ctccctcatc gcgatggatc cgccggaaca cagccgggcc aggcgtgacg tcgtcgggga 300 
attcaccgtc aagcggatga aggcgctcca gccgcgaatt cagcagatcg tcgacgaaca 360 
tctcgacgcc ctgctcgcgg gccccaaacc cgccgatctc gtccaggcgc tttccctgcc 420 
cgttccctcg ctggtgatct gcgaactgct cggcgtcccc tattcggacc acgagttctt 480 
ccagtcctgc agttccagga tgctcagccg ggaggtcacc gccgaagaac ggatgaccgc 540 
gttcgagcag ctcgaaaact atctcgacga actggtcacc aagaaggagg cgaacgccac 600

cgaggacgac ctcctcggcc gtcagatcct gaaacagcgg gaaacgggcg aggccgacca 660 

cggtgaactc gtcgggctgg cgttcctgct gctcatcgcc ggacacgaga ccacggcgaa 720

catgatctcg ctcggcacgg tgaccctgct ggagaatccc gatcagctcg cgaagatcaa 780

ggcagacccc ggcaagaccc tcgccgccat cgaggaactc ctgcgggtct tcacgatcgc 840 

ggaaacggcg acctcacgct tcgccacggc ggacgtcgag atcggcggaa cgctgatccg 900 

cgcgggggaa ggggtggtgg gcctgagcaa cgcgggcaac cacgatccgg acggcttcga 960 

gaacccggac accttcgaca tcgaacgcgg cgcgcggcat cacgtcgcgt tcggattcgg 1020

ggtgcaccag tgtctcggcc agaacttggc gaggttggaa ctccagatcg tcttcgatac 1080

gttgttccgg cgagtgccgg gcc 1103 

<210> 38 
<211> 817 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 38 

cttcacccgc gcggatgagc gtgccgccga tctcgacgtc cgccgtggcg aagcgtgagg 60

tcgccgtctc cgcgatggtg aagatccgca ggagttcctc gatcgcggcg agggtcttgc 120

ccggatccgc cttgatcttc gccagctgat cggggttctc cagcagggtc accgtgccga 180

gcgagatcat gttcgccgta gtctcgtgcc ccgcgatgag caggaggaac gccagaccga 240

ccagttcgcc gtggtcggct tcgccggatt cgcgctgctt caggatctgg cggccgagga 300

ggtcgtcctc ggtggcgttc gcctccttct tcgtgacgag ttcgtcgaga tagttctcga 360

gcgactcgaa cgcggtcatc cgttcttcgg cggtgacttc ccggctgagc atccgggaac 420

tgcaggactg gaagaactcg tggtccgaat aggggacacc gagcagttcg cagatcacca 480

aggacggaac cggcagggaa agcgcctgga cgagatcggc gggtttgggg ccggcgagca 540

gggcgtcgat atgctcgtcg acgatctgct gaatacgtgg ctgaagcgct ttcatgcgct 600

tgacggtgaa ttccccgacg acgtcacgcc tggccttgcc gtgttccggc gggtccatcg 660

cgatgaggga cgggcggaac ggcttgtcct cgcgccggat ctgccgcgcc accatcagcg 720

ggaacgacgg actctgccgg tcggagctga aatgcggact gctcagcatt tcgcggatgt 780

cttcgagccg ggtgagcgcc caagcggttt gaccgga 817

<210> 39 
<211> 1105
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 39 

ccgcgacctt gccgctggcc cgcaaatgcc cgttttcacc gccgcccgaa tacgagcggc 60 

ttcgccggga aagtccggtt tcccgggtcg gtctcccgtc cggtcaaacc gcttgggcgc 120

tcacccggct cgaggacatc cgcgaaatgc tgagcagtcc gcatttcagc tccgaccggc 180 

agagtccgtc gttcccgctg atggtggccc ggcagatccg gcgcgaggac aagccgttcc 240 

gcccgtccct catctcgatg gacccgccgg aacacagcaa ggccaggcgt gacgtcgtcg 300 

gggaattcac cgtcaagcgc atgaaagcgc ttcagccgcg tattcagcag atcgtcgacg 360 

agcatatcga cgccctgctc gccggcccca aacccgccga tctcgtccag gcgctttccc 420

tgccggttcc gtccttggtg atctgcgaac tgctcggtgt cccctattcg gaccacgagt 480 

tcttccagtc ctgcagttcc cggatgctca gccgggaagt caccgccgaa gaacggatga 540 

ccgcgttcga gtcgctcgag aactatctcg acgaactcgt cacgaagaag gaggcgaacg 600 

ccaccgagga cgacctcctc ggccgccaga tcctgaagca gcgcgaaacg ggcgaagccg 660 

accacggcga actggtcggg ctggcgttcc tcctgctcat cgcgggacac gagacgacgg 720

cgaacatgat ctcgctcggc acggcgaccc tgctggagaa ccccgaccag ctggcgaaga 780 

tcaaggccga tccgggcaag accctcgccg cgatcgagga gctcctgcgg gtcttcacca 840

tcgcggagac ggcgacctca cgcttcgcca cggcggacgt cgagatcggc ggcacgctca 900

tccgcgcggg tgaaggcgtc gtcggcctga gtaacgcggg caaccacgat ccggaaggct 960 

tcgagaaccc ggacgccttc gacatcgaac gcggcgcgcg gcaccacgtc gccttcggat 1020 

tcggtgtgca ccaatgcctc ggccagaact tggcgaggtt ggaactccag atcgtgttcg 1080 

atacgttgtt ccggcgagtg ccggg 1105 

<210> 40 
<211> 1304 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 40 

ccttgccact ggcccgcaaa tgcccgtttt caccaccgcc cgaatacgag cggctccgcc 60

gggaaagtcc ggtttcccgg gtcggtctcc cctccggtca aaccgcttgg gcgctcaccc 120 

ggctcgaaga catccgcgaa atgctgagca gtccgcattt cagctccgac cggcagagtc 180 

cgtcgttccc gctgatggtg gcgcggcaga tccggcgcga ggacaagccg ttccgcccgt 240 
ccctcatcgc gatggacccg ccggaacacg gcaaggccag gcgtgacgtc gtcggggaat 300

tcaccgtcaa gcgcatgaaa gcgcttcagc cacgtattca gcagatcgtc gacgagcata 360

tcgacgccct gctcgccggc cccaaacccg ccgatctcgt ccaggcgctt tccctgccgg 420

ttccgtcctt ggtgatctgc gaactgctcg gtgtccccta ttcggaccac gagttcttcc 480

agtcctgcag ttcccggatg ctcagccggg aagtcaccgc cgaagaacgg atgaccgcgt 540

tcgagtcgct cgagaactat ctcgacgaac tcgtcacgaa gaaggaggcg aacgccaccg 600

aggacgacct cctcggccgc cagatcctga agcagcgcga atccggcgaa gccgaccacg 660

gcgaactggt cggtctggcg ttcctcctgc tcatcgcggg gcacgagact acggcgaaca 720

tgatctcgct cggcacggtg accctgctgg agaaccccga tcagctggcg aagatcaagg 780

cggatccggg caagaccctc gccgcgatcg aggaactcct gcggatcttc accatcgcgg 840

agacggcgac ctcacgcttc gccacggcgg acgtcgagat cggcggcacg ctcatccgcg 900

cgggtgaagg cgtcgtcggc ctgagcaacg cgggcaacca cgatccggac ggcttcgaga 960

acccggacac cttcgacatc gaacgcggcg cgcggcatca cgtcgccttc ggattcggtg 1020

tgcaccaatg cctcggccag aacttggcga ggttggaact ccagatcgtg ttcgatacgt 1080

tgttccggcg agtgccgggc atccggatcg ccgtaccggt cgacgaactg ccgttcaagc 1140

acgattcgac gatctacggc ctccgcgccc tgccggtcac ctggtaggag gagccatgaa 1200

gatcatcgcg gacaccggga agtgcgtggg ggcgggccag tgcgtgctca ccgatcccga 1260

tctgttcgac cagagcgagg acgacgggac ggtcctcctg ctga 1304


<210> 41 
<211> 825 
<212> DNA 

<213> Amycolatopsis orientalis 








<400> 41 
ctccggtcaa accgcttggg cgctcacccg gctcgaagac atccgcgaaa tgctgagcag 60

tccgcatttc agctccgacc ggcagaatcc gtcgttcccg ctgatggtgg cgcggcagat 120

ccggcgcgag gacaagccgt tccgcccgtc cctcatcgcg atggacccgc cggaacacag 180

caaggccagg cgtgacgtcg tcggggaatt caccgtcaag cgcatgaaag cgcttcagcc 240

gcgtattcag cagatcgtcg acgagcatat cgacgccctg ctcgccggcc ccaaacccgc 300

cgatctcgtc caggcgcttt ccctgccggt tccgtccttg gtgatctgcg aactgctcgg 360

tgtcccctat tcggaccacg agttcttcca gtcctgcagt tcccggatgc tcagccggga 420

agtcaccgcc gaagaacgga tgaccgcgtt cgagtcgctc gagaactatc tcgacgaact 480

cgtcacgaag aaggaggcga acgccaccga ggacgacctc ctcggccgcc agatcctgaa 540

gcagcgggaa acgggcgagg ccgaccacgg cgaactcgtc gggctggcgt tcctgctgct 600 

catcgccggg cacgagacga cggcgaacat gatctcgctc ggcacggcga ccctgctgga 660 

gaaccccgac cagctggcga agatcaaggc ggatccgggc aagaccctcg ccgcgatcga 720 

ggaactgctg cgcgtcttca cgatcgcgga gacggcgacc tcacgcttcg ccacggcgga 780 

cgtcgagatc ggcggcacgc tcatccgcgc gggtgaaggc gtcgt 825 

<210> 42 
<211> 1103 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 42 

gcgaccttgc cactggcccg caaatgcccg ttttcaccac cgcccgaata cgagcggctc 60 

cgccgggaaa gtccggtttc ccgggtcggt ctcccctccg gtcaaaccgc ttgggcgctc 120 

acccggctcg aagacatccg cgaaatgctg agcagtccgc atttcagctc cgaccggcag 180

agtccgtcgt tcccgctgat ggtggcgcgg cagatccggc gcgaggacaa gccgttccgc 240 

ccgtccctca tcgcgatgga cccgccggaa cacggcaagg ccaggcgtga cgtcgtcggg 300 

gaattcaccg tcaagcgcat gaaagcgctt cagccacgta ttcagcagat cgtcgacgag 360

catatcgacg ccctgctcgc cggccccaaa cccgccgatc tcgtccaggc gctttccctg 420

ccggttccgt ccttggtgat ctgcgaactg ctcggtgtcc cctattcgga ccacgagttc 480

ttccagtcct gcagttcccg gatgctcagc cgggaagtca ccgccgaaga acggatgacc 540

gcgttcgagt cgctcgagaa ctatctcgac gaactcgtca cgaagaagga ggcgaacgcc 600 

accgaggacg acctcctcgg ccgccagatc ctgaagcagc gcgaatccgg cgaagccgac 660 

cacggcgaac tggtcggtct ggcgttcctc ctgctcatcg cggggcacga gactacggcg 720 

aacatgatct cgctcggcac ggtgaccctg ctggagaacc ccgatcagct ggcgaagatc 780

aaggcggatc cgggcaagac cctcgccgcg atcgaggaac tcctgcggat cttcaccatc 840 

gcggagacgg cgacctcacg cttcgccacg gcggacgtcg agatcggcgg cacgctcatc 900 

cgcgcgggtg aaggcgtcgt cggcctgagc aacgcgggca accacgatcc ggacggcttc 960

gagaacccgg acaccttcga catcgaacgc ggcgcgcggc atcacgtcgc cttcggattc 1020

ggtgtgcacc aatgcctcgg ccagaacttg gcgaggttgg aactccagat cgtgttcgat 1080

acgttgttcc ggcgagtgcc ggg 1103



<210> 43 
<211> 402
<212> PRT 

<213> Amycolatopsis orientalis 
<400> 43 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Ile Ala Met Asp Pro
85 90 95

Pro Glu His Ser Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Met Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Thr
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Phe Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Ala
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Val Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Glu Gly Phe Glu Asn Pro Asp Ala Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400



Pro Val 
<210> 44

<211> 367 

<212> PRT

<213> Amycolatopsis orientalis 

<400> 44 

Thr Leu Pro Leu Ala Arg Lys Cys Pro Phe Ser Pro Pro Pro Glu Tyr
1 5 10 15

Glu Arg Leu Arg Arg Glu Ser Pro Val Ser Arg Val Gly Leu Pro Ser
20 25 30

Gly Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu Met
35 40 45

Leu Ser Ser Pro His Phe Ser Ser Asp Arg Gln Ser Pro Ser Phe Pro
50 55 60

Leu Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg Pro
65 70 75 80

Ser Leu Ile Ala Met Asp Pro Pro Glu His Ser Arg Ala Arg Arg Asp
85 90 95

Val Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro Arg
100 105 110

Ile Gln Gln Ile Val Asp Glu His Leu Asp Ala Leu Leu Ala Gly Pro
115 120 125

Lys Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser Leu
130 135 140

Val Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe Phe
145 150 155 160

Gln Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu Glu
165 170 175

Arg Met Thr Ala Phe Glu Gln Leu Glu Asn Tyr Leu Asp Glu Leu Val
180 185 190

Thr Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg Gln
195 200 205

Ile Leu Lys Gln Arg Glu Thr Gly Glu Ala Asp His Gly Glu Leu Val
210 215 220

Gly Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala Asn
225 230 235 240

Met Ile Ser Leu Gly Thr Val Thr Leu Leu Glu Asn Pro Asp Gln Leu
245 250 255

Ala Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu Glu
260 265 270

Leu Leu Arg Val Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe Ala
275 280 285

Thr Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu Gly
290 295 300

Val Val Gly Leu Ser Asn Ala Gly Asn His Asp Pro Asp Gly Phe Glu
305 310 315 320

Asn Pro Asp Thr Phe Asp Ile Glu Arg Gly Ala Arg His His Val Ala
325 330 335

Phe Gly Phe Gly Val His Gln Cys Leu Gly Gln Asn Leu Ala Arg Leu
340 345 350

Glu Leu Gln Ile Val Phe Asp Thr Leu Phe Arg Arg Val Pro Gly
355 360 365



<210> 45 

<211> 272 

<212> PRT 

<213> Amycolatopsis orientalis



<400> 45 

Ser Gly Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu
1 5 10 15

Met Leu Ser Ser Pro His Phe Ser Ser Asp Arg Gln Ser Pro Ser Phe
20 25 30

Pro Leu Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg
35 40 45

Pro Ser Leu Ile Ala Met Asp Pro Pro Glu His Gly Lys Ala Arg Arg
50 55 60

Asp Val Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro
65 70 75 80

Arg Ile Gln Gln Ile Val Asp Glu His Ile Asp Ala Leu Leu Ala Gly
85 90 95

Pro Lys Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser
100 105 110

Leu Val Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe
115 120 125

Phe Gln Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu
130 135 140

Glu Arg Met Thr Ala Phe Glu Ser Leu Glu Asn Tyr Leu Asp Glu Leu
145 150 155 160

Val Thr Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg
165 170 175

Gln Ile Leu Lys Gln Arg Glu Ser Gly Glu Ala Asp His Gly Glu Leu
180 185 190

Val Gly Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala
195 200 205

Asn Met Ile Ser Leu Gly Thr Val Thr Leu Leu Glu Asn Pro Asp Gln
210 215 220

Leu Ala Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu
225 230 235 240

Glu Leu Leu Arg Ile Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe
245 250 255

Ala Thr Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu
260 265 270



<210> 46 

<211> 367 

<212> PRT 

<213> Amycolatopsis orientalis 

<400> 46 

Ala Thr Leu Pro Leu Ala Arg Lys Cys Pro Phe Ser Pro Pro Pro Glu 
1 5 10 15 



Tyr Glu Arg Leu Arg Arg Glu Ser Pro Val Ser Arg Val Gly Leu Pro 
20 25 30 



Ser Gly Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu
35 40 45

Met Leu Ser Ser Pro His Phe Ser Ser Asp Arg Gln Ser Pro Ser Phe
50 55 60

Pro Leu Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg
65 70 75 80

Pro Ser Leu Ile Ser Met Asp Pro Pro Glu His Ser Lys Ala Arg Arg
85 90 95

Asp Val Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro
100 105 110

Arg Ile Gln Gln Ile Val Asp Glu His Ile Asp Ala Leu Leu Ala Gly
115 120 125

Pro Lys Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser
130 135 140

Leu Val Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe
145 150 155 160

Phe Gln Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu
165 170 175

Glu Arg Met Thr Ala Phe Glu Ser Leu Glu Asn Tyr Leu Asp Glu Leu
180 185 190

Val Thr Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg
195 200 205

Gln Ile Leu Lys Gln Arg Glu Thr Gly Glu Ala Asp His Gly Glu Leu
210 215 220

Val Gly Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala
225 230 235 240

Asn Met Ile Ser Leu Gly Thr Ala Thr Leu Leu Glu Asn Pro Asp Gln
245 250 255

Leu Ala Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu
260 265 270

Glu Leu Leu Arg Val Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe
275 280 285

Ala Thr Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu
290 295 300

Gly Val Val Gly Leu Ser Asn Ala Gly Asn His Asp Pro Glu Gly Phe
305 310 315 320

Glu Asn Pro Asp Ala Phe Asp Ile Glu Arg Gly Ala Arg His His Val
325 330 335

Ala Phe Gly Phe Gly Val His Gln Cys Leu Gly Gln Asn Leu Ala Arg
340 345 350

Leu Glu Leu Gln Ile Val Phe Asp Thr Leu Phe Arg Arg Val Pro
355 360 365



<210> 47 

<211> 394 

<212> PRT 

<213> Amycolatopsis orientalis 

<400> 47 

Leu Pro Leu Ala Arg Lys Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu
1 5 10 15

Arg Leu Arg Arg Glu Ser Pro Val Ser Arg Val Gly Leu Pro Ser Gly
20 25 30

Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu Met Leu
35 40 45

Ser Ser Pro His Phe Ser Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu
50 55 60

Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser
65 70 75 80

Leu Ile Ala Met Asp Pro Pro Glu His Gly Lys Ala Arg Arg Asp Val
85 90 95

Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro Arg Ile
100 105 110

Gln Gln Ile Val Asp Glu His Ile Asp Ala Leu Leu Ala Gly Pro Lys
115 120 125

Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser Leu Val
130 135 140

Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe Phe Gln
145 150 155 160

Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu Glu Arg
165 170 175

Met Thr Ala Phe Glu Ser Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr
180 185 190

Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile
195 200 205

Leu Lys Gln Arg Glu Ser Gly Glu Ala Asp His Gly Glu Leu Val Gly
210 215 220

Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala Asn Met
225 230 235 240

Ile Ser Leu Gly Thr Val Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala
245 250 255

Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu
260 265 270

Leu Arg Ile Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr
275 280 285

Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu Gly Val
290 295 300

Val Gly Leu Ser Asn Ala Gly Asn His Asp Pro Asp Gly Phe Glu Asn
305 310 315 320

Pro Asp Thr Phe Asp Ile Glu Arg Gly Ala Arg His His Val Ala Phe
325 330 335

Gly Phe Gly Val His Gln Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu
340 345 350

Leu Gln Ile Val Phe Asp Thr Leu Phe Arg Arg Val Pro Gly Ile Arg
355 360 365

Ile Ala Val Pro Val Asp Glu Leu Pro Phe Lys His Asp Ser Thr Ile
370 375 380

Tyr Gly Leu Arg Ala Leu Pro Val Thr Trp
385 390


<210> 48

<211> 274

<212> PRT

<213> Amycolatopsis orientalis

<400> 48





Ser Gly Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu
1 5 10 15

Met Leu Ser Ser Pro His Phe Ser Ser Asp Arg Gln Asn Pro Ser Phe
20 25 30

Pro Leu Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg
35 40 45

Pro Ser Leu Ile Ala Met Asp Pro Pro Glu His Ser Lys Ala Arg Arg
50 55 60

Asp Val Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro
65 70 75 80

Arg Ile Gln Gln Ile Val Asp Glu His Ile Asp Ala Leu Leu Ala Gly
85 90 95

Pro Lys Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser
100 105 110

Leu Val Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe
115 120 125

Phe Gln Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu
130 135 140

Glu Arg Met Thr Ala Phe Glu Ser Leu Glu Asn Tyr Leu Asp Glu Leu
145 150 155 160

Val Thr Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg
165 170 175

Gln Ile Leu Lys Gln Arg Glu Thr Gly Glu Ala Asp His Gly Glu Leu
180 185 190

Val Gly Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala
195 200 205

Asn Met Ile Ser Leu Gly Thr Ala Thr Leu Leu Glu Asn Pro Asp Gln
210 215 220

Leu Ala Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu
225 230 235 240

Glu Leu Leu Arg Val Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe
245 250 255

Ala Thr Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu
260 265 270

Gly Val



<210> 49 

<211> 367 

<212> PRT 

<213> Amycolatopsis orientalis 

<400> 49 



Ala Thr Leu Pro Leu Ala Arg Lys Cys Pro Phe Ser Pro Pro Pro Glu
1 5 10 15

Tyr Glu Arg Leu Arg Arg Glu Ser Pro Val Ser Arg Val Gly Leu Pro
20 25 30

Ser Gly Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu
35 40 45

Met Leu Ser Ser Pro His Phe Ser Ser Asp Arg Gln Ser Pro Ser Phe
50 55 60

Pro Leu Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg
65 70 75 80

Pro Ser Leu Ile Ala Met Asp Pro Pro Glu His Gly Lys Ala Arg Arg
85 90 95

Asp Val Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro
100 105 110

Arg Ile Gln Gln Ile Val Asp Glu His Ile Asp Ala Leu Leu Ala Gly
115 120 125

Pro Lys Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser
130 135 140

Leu Val Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe
145 150 155 160

Phe Gln Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu
165 170 175

Glu Arg Met Thr Ala Phe Glu Ser Leu Glu Asn Tyr Leu Asp Glu Leu
180 185 190

Val Thr Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg
195 200 205

Gln Ile Leu Lys Gln Arg Glu Ser Gly Glu Ala Asp His Gly Glu Leu
210 215 220

Val Gly Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala
225 230 235 240

Asn Met Ile Ser Leu Gly Thr Val Thr Leu Leu Glu Asn Pro Asp Gln
245 250 255

Leu Ala Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu
260 265 270

Glu Leu Leu Arg Ile Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe
275 280 285

Ala Thr Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu
290 295 300

Gly Val Val Gly Leu Ser Asn Ala Gly Asn His Asp Pro Asp Gly Phe
305 310 315 320

Glu Asn Pro Asp Thr Phe Asp Ile Glu Arg Gly Ala Arg His His Val
325 330 335

Ala Phe Gly Phe Gly Val His Gln Cys Leu Gly Gln Asn Leu Ala Arg
340 345 350

Leu Glu Leu Gln Ile Val Phe Asp Thr Leu Phe Arg Arg Val Pro
355 360 365



<210> 50

<211> 25

<212> DNA

<213> Artificial
<220>

<223> Synthetic

<400> 50



aggaaaccac cgcgaccttg ccact 25 
<210> 51

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 51 

accgaatccg aaggcgacgt gatgc 25 



<210> 52 

<211> 23 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 52 

cggaatgaat ccatccgcat acg 23 

<210> 53 

<211> 23 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 53 

tgatcttcat ggctcctcct acc 23 



<210> 54

<211> 35

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<220>

<221> misc_feature

<222> (18)..(20)

<223> n=a, c, g or t



<400> 54 

gcgaagccga ccacggcnnn ctggtcggtc tggcg 35 



<210> 55
<211> 35
<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<220>

<221> misc_feature

<222> (16)..(18)

<223> n=a, c, g or t



<400> 55 

cgccagaccg accagnnngc cgtggtcggc ttcgc 35 



<210> 56

<211> 35

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<220>

<221> misc_feature

<222> (14)..(14)

<223> n=a, c, g or t

<400> 56



ggtcggtctg gcgnysctcc tgctcatcgc ggggc 35 



<210> 57

<211> 35

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<220>

<221> misc_feature

<222> (22)..(22)

<223> n=a, c, g or t

<400> 57







gccccgcgat gagcaggags rncgccagac cgacc 35 

<210> 58

<211> 35

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<220>

<221> misc_feature

<222> (17)..(17)

<223> n=a, c, g or t



<400> 58 

ggtcggtctg gcgttcnysc tgctcatcgc ggggc 35 



<210> 59 

<211> 35 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<220> 

<221> misc_feature

<222> (19)..(19)

<223> n=a, c, g or t

<400> 59 

gccccgcgat gagcagsrng aacgccagac cgacc 35 

<210> 60 

<211> 1215

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic



<400> 60

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60
ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120
tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180
ccgcatttca gctccgacca gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240
cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300
aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360
cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420
gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480
gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540
gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600
gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660
cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720
atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780
aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840
gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900
gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960
ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020
cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080
ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140
gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200
ccggtcacct ggtag 1215



<210> 61 
<211> 404 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 61 

Met Thr Asp Val Gln Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Gln Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 62 
<211> 1215 
<212> DNA 

<213> Artificial sequence 
<220>

<223> Synthetic
<400> 62

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 
ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120
tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 
ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 
cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcggga tggacccgcc ggaacacggc 300 
aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 
cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420
gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 
gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540 
gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600 
gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 
cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gcccggcggc gcccctgctc 720
atcgcggggc acgagaccac ggcgaacatg atctcgcccg gcacggtgac cctgctggag 780
aaccccgacc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840
gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900
gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960
ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020
cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080
ttggaactcc agaccgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140
gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200
ccggtcacct ggtag 1215



<210> 63 

<211> 404 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 63 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Gly Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Thr Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 64
<211> 1215
<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<400> 64

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60
ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120
tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180
ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240
cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300
aaggccaggc gtgacgccgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360
cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420
gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480
gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540
gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600
gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660
cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720
atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780
aaccccgatc agctggcgaa gatcaaggca gatccgggca agaccctcgc cgcgatcgag 840
gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900
gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960
ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020
cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080
ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140
gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200
ccggtcacct ggtag 1215



<210> 65 
<211> 404 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 65 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Ala Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 66 
<211> 1215 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 66 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg ctcgcaaatg cccgttttca 60 

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240

cggcgcgagg acaagccgtt ccacccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540 

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600 

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720 

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780 

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900 

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960 
ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080 

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140 

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200 

ccggtcacct ggtag 1215 



<210> 67

<211> 404

<212> PRT

<213> Artificial
<220>

<223> Synthetic

<400> 67



Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe His Pro Ser Leu Val Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp

<210> 68 

<211> 1215 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 68

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60
ccaccgcccg aatacgagcg gctccgccgg aaaagtccgg tttcccgggt cggtctcccc 120
tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180
ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240
cggcgcgagg acaagccgtt ccgcccgtcc ctcatcgcga tggacccgcc ggaacacggc 300
aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360
cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420
gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480
gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540
gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600
gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660
cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcgtt cctcctgctc 720
atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780
aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840
gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900
gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960
ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020
cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080
ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140
gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200
ccggtcacct ggtag 1215



<210> 69 

<211> 404 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 69 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Lys Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Ile Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175



Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Phe Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 70 

<211> 35 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<220> 

<221> misc_feature

<222> (20)..(21)

<223> n=a, c, g or t 



<400> 70 

gttccgcccg tccctcgtcn nsatggaccc gccgg 35 



<210> 71

<211> 35

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<220>

<221> misc_feature

<222> (15)..(16)

<223> n=a, c, g or t

<400> 71



cctgcagttc ccggnnsctc agccgggaag tcacc 35 



<210> 72 
<211> 1215 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 72 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60
ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120
tccggtcaga ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180
ccgcatttca gctccgacca gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240
cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300
aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360
cgtattcagc agatcgtcga cgagcatacc gacgccctgc tcgccggccc caaacccgcc 420
gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480
gtcccctatt cggaccacga gttcttccag tcctgcagtt cccgggcgct cagccgggaa 540
gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600
gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660
cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720
atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780
aaccccgatc agctggcgaa gatcaaggcg gacccgggca agaccctcgc cgcgatcgag 840
gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900
gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960
ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020
cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080
ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140
gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200
ccggtcacct ggtag 1215

<210> 73
<211> 404 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 73 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys 



Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser 
20 25 30 

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gin Thr Ala.Trp Ala Leu 
35 40 -45 
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Thr Arg Leu Glu Asp lie Arg Glu Met Leu Ser Ser Pro His Phe Ser 
50 55 60 



Ser Asp Gin Gin Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gin lie 
65 70 75 80 



Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Ala Met Asp Pro 
85 90 95 



Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val 
100 105 110 



Lys Arg Met Lys Ala Leu Gin Pro Arg He Gin Gin He Val Asp Glu 
115 120 125 



His Thr Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gin 
130 135 140 



Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160 



Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Ala
165 170 175 



Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser 
180 185 190 



Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205



Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220 



Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu 
225 230 235 240 



Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255 



Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285 



Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300 



Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320 



Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335 



Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350 



Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365 



Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380 



Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400 



Pro Val Thr Trp 



<210> 74

<211> 1215

<212> DNA

<213> Artificial sequence

<220>

<223> Synthetic

<400> 74



atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180

ccgcatttca gctccgacca gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 

cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300 

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaggc gcttcagcca 360 

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccacc 420

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggtcgct cagccgggaa 540

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780

aaccccgatc agctggcgaa gatcaaggcg gacccgggca agaccctcgc cgcgatcgag 840

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960


ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020


cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200

ccggtcacct ggtag 1215



<210> 75 

<211> 404 

<212> PRT 

<213> Artificial sequence 

<220>

<223> Synthetic 

<400> 75 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys 
1 5 10 15 

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser 
20 25 30 

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45 

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Gln Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80 



Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Ala Met Asp Pro 
85 90 95 



Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val 
100 105 110 



Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125 



His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Thr Asp Leu Val Gln
130 135 140



Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160 



Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Ser
165 170 175 



Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190 

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala 
195 200 205 

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220 



Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu 
225 230 235 240 



Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255



Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270



Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300 



Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320 



Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335 



Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350 



Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365 



Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380 



Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400 



Pro Val Thr Trp 



<210> 76 

<211> 404

<212> PRT

<213> Saccharopolyspora erythraea

<400> 76

Met Thr Thr Val Pro Asp Leu Glu Ser Asp Ser Phe His Val Asp Trp 
1 5 10 15 



Tyr Arg Thr Tyr Ala Glu Leu Arg Glu Thr Ala Pro Val Thr Pro Val 
20 25 30 



Arg Phe Leu Gly Gln Asp Ala Trp Leu Val Thr Gly Tyr Asp Glu Ala
35 40 45 



Lys Ala Ala Leu Ser Asp Leu Arg Leu Ser Ser Asp Pro Lys Lys Lys 
50 55 60 



Tyr Pro Gly Val Glu Val Glu Phe Pro Ala Tyr Leu Gly Phe Pro Glu 
65 70 75 80 

Asp Val Arg Asn Tyr Phe Ala Thr Asn Met Gly Thr Ser Asp Pro Pro
85 90 95 



Thr His Thr Arg Leu Arg Lys Leu Val Ser Gln Glu Phe Thr Val Arg
100 105 110 



Arg Val Glu Ala Met Arg Pro Arg Val Glu Gln Ile Thr Ala Glu Leu
115 120 125 



Leu Asp Glu Val Gly Asp Ser Gly Val Val Asp Ile Val Asp Arg Phe
130 135 140 



Ala His Pro Leu Pro Ile Lys Val Ile Cys Glu Leu Leu Gly Val Asp
145 150 155 160



Glu Lys Tyr Arg Gly Glu Phe Gly Arg Trp Ser Ser Glu Ile Leu Val
165 170 175 



Met Asp Pro Glu Arg Ala Glu Gln Arg Gly Gln Ala Ala Arg Glu Val
180 185 190 



Val Asn Phe Ile Leu Asp Leu Val Glu Arg Arg Arg Thr Glu Pro Gly
195 200 205 



Asp Asp Leu Leu Ser Ala Leu Ile Arg Val Gln Asp Asp Asp Asp Gly
210 215 220 



Arg Leu Ser Ala Asp Glu Leu Thr Ser Ile Ala Leu Val Leu Leu Leu
225 230 235 240 



Ala Gly Phe Glu Ala Ser Val Ser Leu Ile Gly Ile Gly Thr Tyr Leu
245 250 255 



Leu Leu Thr His Pro Asp Gln Leu Ala Leu Val Arg Arg Asp Pro Ser
260 265 270



Ala Leu Pro Asn Ala Val Glu Glu Ile Leu Arg Tyr Ile Ala Pro Pro
275 280 285 



Glu Thr Thr Thr Arg Phe Ala Ala Glu Glu Val Glu Ile Gly Gly Val
290 295 300 

Ala Ile Pro Gln Tyr Ser Thr Val Leu Val Ala Asn Gly Ala Ala Asn
305 310 315 320



Arg Asp Pro Lys Gln Phe Pro Asp Pro His Arg Phe Asp Val Thr Arg
325 330 335 



Asp Thr Arg Gly His Leu Ser Phe Gly Gln Gly Ile His Phe Cys Met
340 345 350 



Gly Arg Pro Leu Ala Lys Leu Glu Gly Glu Val Ala Leu Arg Ala Leu 
355 360 365 



Phe Gly Arg Phe Pro Ala Leu Ser Leu Gly Ile Asp Ala Asp Asp Val
370 375 380 



Val Trp Arg Arg Ser Leu Leu Leu Arg Gly Ile Asp His Leu Pro Val
385 390 395 400 



Arg Leu Asp Gly 